Monday, April 4, 2016

Domain Driven Design: a "hands on" example (part 3 of 3)

Finally, we get to the last part of this series. Let's dig into some code examples for the "Product Catalog" context. The code shown here is for tutorial purposes only and should not be used in a real production environment. If you decide to use it, you do so at your own risk. The goal here is to show how some of the discussed concepts can be translated into code.

Logical layering

First of all, I would like to introduce the logical layered architecture that I like to use in my applications. It is a kind of "relaxed layer", pretty simple, like this:


Picture 1: Application logical layered architecture

Notice that when I say "my applications", I mean the business layer, the part of the system that solves my business problem. Other layers like view, interface and utility layers are not considered. 

In this manner, the "Application" layer acts like a Facade to the model so that it is not exposed to other external layers. The "Model" layer is where the model itself resides, while the "Repository" layer is in charge of persistence, whatever type of data source is used (database systems, files, etc.).

The DTO (Data Transfer Object - "Patterns of Enterprise Application Architecture" Fowler, Martin) layer is the way my application exposes data. I like to use it so I don't have to expose my Domain objects and their interfaces, which keeps the coupling in my system low.

Code view: the Model layer

This is the Model for the "Product Catalog" context from the previous post:


Picture 2: "Product Catalog" context model

I tend to write my classes pretty close to what I modeled. Why? I think it is the gist of applying Domain Driven Design: write your code as close as possible to your model and, therefore, your business. You always end up adding or modifying something when you are coding and materializing your abstract model. Totally normal.

Let's take a look at specific parts of the "Product" class and its implementation. The first one, the constructor:


Picture 3: "Product" class "Constructor" method

What is important to notice here is that this constructor guarantees the least amount of information necessary for creating a new "Product" instance. In my model, it would not make any sense to have an instance of "Product" without an "id", an "inventory code" and a "title". Provide me at least these and I will give you an instance of "Product" in the status "Registered", so you can work on it further. That is the idea.
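Since the picture is not reproduced here as text, the idea can be sketched roughly as follows. This is a minimal Python sketch, not the original code from the downloadable solution; the names "ProductStatus", "product_id" and "inventory_code" are my own assumptions for illustration.

```python
from enum import Enum


class ProductStatus(Enum):
    REGISTERED = "Registered"
    ACTIVE = "Active"


class Product:
    """Hypothetical sketch of the "Product" entity, not the original code."""

    def __init__(self, product_id: str, inventory_code: str, title: str) -> None:
        # Refuse to build an instance without the minimum required data.
        for name, value in (("product_id", product_id),
                            ("inventory_code", inventory_code),
                            ("title", title)):
            if not value or not value.strip():
                raise ValueError(f"{name} is required")
        self.id = product_id
        self.inventory_code = inventory_code
        self.title = title.strip()
        # Every new product starts in the "Registered" status.
        self.status = ProductStatus.REGISTERED
```

With this shape there is simply no way to obtain a "Product" that lacks one of the three mandatory values.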

Now, let's take a look at some methods:


Picture 4: "Product" class "ChangeTitle" method

"Title" is mandatory data. There is no way to create an instance of "Product" without providing it. On the other hand, it is possible to change it. Can you spot the semantic difference in the method name? The method could simply be named "SetTitle" (a very common approach).

However, "ChangeTitle" is more intuitive, since the title cannot really be set: it is already required when the instance is created, so setting it would not make sense. It might not seem important at all, but this subtle naming approach is what brings your implementation closer to your real business rule in this case.

Furthermore, the method ensures that only valid changes take place. Any other kind of business validation regarding "Title" (like maximum size allowed, forbidden words, etc.) would fit well in this method.
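A rough Python sketch of such a method could look like this. It is only an illustration under my own assumptions: the 120-character limit and the helper "_validated" are hypothetical and do not come from the original solution.

```python
class Product:
    """Hypothetical sketch; only the title-related part is shown."""

    MAX_TITLE_LENGTH = 120  # assumed limit, purely illustrative

    def __init__(self, title: str) -> None:
        self._title = self._validated(title)

    @property
    def title(self) -> str:
        return self._title

    def change_title(self, new_title: str) -> None:
        # The same validation runs on creation and on every change.
        self._title = self._validated(new_title)

    @classmethod
    def _validated(cls, title: str) -> str:
        if title is None or not title.strip():
            raise ValueError("title is mandatory")
        title = title.strip()
        if len(title) > cls.MAX_TITLE_LENGTH:
            raise ValueError("title is too long")
        return title
```

Because the title is read-only from the outside, "change_title" is the only door, and an invalid change leaves the previous title untouched.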

Let's see another one:


Picture 5: "Product" class "Activate" method

The "Activate" method is interesting. A product should only be activated if it meets several business requirements. These requirements are tested in the method. If all of them are met, the status of the instance is changed from "Registered" to "Active". This is your business logic being handled by your model instead of by other layers of your application.
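To give an idea of the shape of such a method, here is a small Python sketch. The concrete requirements are my own assumption (shipping weight and dimension must have been provided); the real ones live in the downloadable solution.

```python
from enum import Enum


class ProductStatus(Enum):
    REGISTERED = "Registered"
    ACTIVE = "Active"


class Product:
    """Hypothetical sketch of the activation rule, not the original code."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.shipping_weight = None  # provided later, after registration
        self.dimension = None
        self.status = ProductStatus.REGISTERED

    def activate(self) -> None:
        # Assumed rules: only a "Registered" product with weight and
        # dimension already provided may become "Active".
        if self.status is not ProductStatus.REGISTERED:
            raise ValueError("only a Registered product can be activated")
        missing = [name for name in ("shipping_weight", "dimension")
                   if getattr(self, name) is None]
        if missing:
            raise ValueError("cannot activate, missing: " + ", ".join(missing))
        self.status = ProductStatus.ACTIVE
```

The caller never flips the status directly; it asks the model, and the model decides.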

Code view: the Application layer

As I said before, the Application layer works like a Facade to the Model, so that I can avoid high coupling of my Model to other layers. The Application layer usually is used to support business transaction rules. It is used to coordinate operations executed on the Model and can provide general application support as well, like authentication, security, integration and so on.

I like to see my Application classes as a kind of "Transaction Script" - "Patterns of Enterprise Application Architecture" Fowler, Martin - coordinating model classes in order to achieve some business transaction, as explained above.

As an example, I created a "CatalogService" Application class in order to support business transactions regarding the "Product Catalog" context:


Picture 6: "CatalogService" Application Class

There are three important things that should be noticed in this class. The first one, it does not expose the Model. It works only with native data and DTO interfaces (CatalogItem is a DTO).

Second, it provides only real business transaction methods, all pertinent to the "Product Catalog" context. There is no generic method like "UpdateProduct", which would transfer the business transaction responsibility to its caller.

Third and last, it can be used in many different situations in my system. I can build any graphic interface layer and couple it to this layer. I can build a service layer (REST, SOAP, etc.) and put it in front of this layer (a way to expose business logic remotely). I can build integration with other contexts through the Application layer. All of these aspects contribute to an easily maintainable and robust business application.

Let's take a look at some of these methods in detail:


Picture 7: "CatalogService" class "ProvideDimensionAndWeight" method


Picture 8: "CatalogService" class "ActivateProduct" method


Picture 9: "CatalogService" class "GetCatalogItems" method

The first two exemplified methods (ProvideDimensionAndWeight and ActivateProduct) follow a very similar pattern: parse data, query the repository, coordinate on the model and persist the object's state. The last one, "GetCatalogItems", has a different pattern: it basically queries the repository and returns data. Persistence techniques are out of the scope of this post. However, it is a good opportunity to talk about some DDD concepts regarding persistence.
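The "parse, query, coordinate, persist" pattern can be sketched in a few lines of Python. Everything here is a simplified assumption of mine: the in-memory repository, the single "weight" argument instead of dimension and weight, and the method names are hypothetical stand-ins for the original classes.

```python
class Product:
    """Tiny stand-in for the model class (hypothetical)."""

    def __init__(self, product_id: str, title: str) -> None:
        self.id = product_id
        self.title = title
        self.shipping_weight_kg = None
        self.status = "Registered"

    def provide_shipping_weight(self, kilograms: float) -> None:
        if kilograms <= 0:
            raise ValueError("weight must be positive")
        self.shipping_weight_kg = kilograms

    def activate(self) -> None:
        if self.shipping_weight_kg is None:
            raise ValueError("shipping weight is required")
        self.status = "Active"


class InMemoryProductRepository:
    """Assumed repository; a real one would talk to a data source."""

    def __init__(self) -> None:
        self._products = {}

    def get(self, product_id: str) -> Product:
        return self._products[product_id]

    def save(self, product: Product) -> None:
        self._products[product.id] = product


class CatalogService:
    def __init__(self, repository: InMemoryProductRepository) -> None:
        self._repository = repository

    def provide_weight(self, product_id: str, kilograms: str) -> None:
        weight = float(kilograms)                    # 1. parse raw input
        product = self._repository.get(product_id)   # 2. query the repository
        product.provide_shipping_weight(weight)      # 3. let the model decide
        self._repository.save(product)               # 4. persist the new state

    def activate_product(self, product_id: str) -> None:
        product = self._repository.get(product_id)
        product.activate()
        self._repository.save(product)
```

Note that the service contains no business rule of its own: it only coordinates, and the callers see plain strings, never the "Product" class.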

"There is no silver bullet". Domain Object Models are good to organize and tackle business complexities. On the other hand, they might not be good for data retrieving in many scenarios. For example, let's take a look at the "CatalogItem" DTO:


Picture 10: "CatalogItem" DTO class

It contains only the data necessary to build a product catalog list (in a web page, for instance). Since I modeled the "Product" class as an Aggregate root, it must be loaded as a whole from whatever data source it is persisted in. Because it was built to support and tackle business rules and complexities, its parts, individually, do not make sense.

Now imagine that you have thousands of products and some millions of users. Do you think it would be a good idea to load all "Product" instances simply to get the data needed to build and show a catalog of products? Definitely not.

The gist of this question is: when you don't need to handle any business rule (as when you just need to show some data), you should avoid the overhead that your Model persistence normally adds. That is why the method "GetCatalogItems" goes straight to the repository: it only needs data.
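A minimal Python sketch of this read path is shown below. The DTO fields, the "find_catalog_items" method and the list of dictionaries standing in for stored records are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CatalogItem:
    """DTO with only what a catalog list needs (fields are assumptions)."""
    product_id: str
    title: str


class ProductRepository:
    def __init__(self, rows: List[Dict[str, object]]) -> None:
        # Each row stands in for a stored product record.
        self._rows = rows

    def find_catalog_items(self) -> List[CatalogItem]:
        # Project active rows straight into DTOs; no aggregate is rebuilt.
        return [CatalogItem(str(row["id"]), str(row["title"]))
                for row in self._rows if row["status"] == "Active"]


class CatalogService:
    def __init__(self, repository: ProductRepository) -> None:
        self._repository = repository

    def get_catalog_items(self) -> List[CatalogItem]:
        return self._repository.find_catalog_items()
```

The write path still goes through the full "Product" aggregate; only this query takes the lighter route.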

Running the code example

In this post, I have covered what I consider the most important concepts. However, if you want to explore a little bit more, the complete code example can be downloaded here. It is a VS 2012 Solution and contains a unit test project, where you can find some unit test methods that you can use for running and debugging, and for checking how the business model has been tested.

Conclusion

In this complex universe of Software Engineering, Programming and Computer Science as well, I tried to share with you guys part of my experience and knowledge regarding "What DDD is really about" and "How to approach your problem in a DDD fashion".

I really believe that no one is the owner of the truth and that the "trade-off" law is always present. It is worth keeping yourself informed by reading books, articles, etc. However, never do something just because everybody else is doing it. Never lose your critical sense. I hope it has been helpful for someone :)


Saturday, February 27, 2016

Domain Driven Design: a "hands on" example (part 2 of 3)

I am glad that this series introducing Domain Driven Design has pleased you readers. Thank you guys for all the positive feedback I have received. So, let's start the second part!

Just to remind you, I ended up with the following solution in the first post of this series:

Picture 1: A DDD approach to the e-Commerce Domain

I am going to pick only two Bounded Contexts as examples for modelling. As I said before, some of these contexts can even be supported by off-the-shelf applications, as the intent here is not to develop the whole solution basically from scratch. I thought it would be a good thing if I worked on the contexts of  the "Product Catalog Website" (the Core Subdomain) and a Support Subdomain (my choice was the "Orders" one).

Concepts for creating Domain Class Models in a DDD fashion

First of all, I would like to state here that any OOD (Object Oriented Design) know-how is useful for constructing Domain Class Models, whether you are a DDD enthusiast or not. For example, I like to use concepts from GRASP (General Responsibility Assignment Software Patterns - Craig Larman) and GoF patterns when I am creating my models. A solid understanding of UML is very useful too. Let's take a look at some core concepts that you must know when modeling a Domain Class Model.

Entity

It is possible to find many definitions for "Entity" in the literature. It can represent many different things. However, for DDD, the meaning of Entity is very clear. Vernon's book, "Implementing Domain Driven Design", has an excellent definition for Entity:

"We design a domain concept as an Entity when we care about its individuality, when distinguishing it from all objects in a system is a mandatory constraint. An Entity is a unique thing and is capable of being changed continuously over a long period of time."

Mutability and unique identity are the two main characteristics entities have.

Value Object

Your capacity to identify Domain concepts and model them as "Value Objects" is one of the most powerful tools you can use in order to succeed in creating DDD models. Why? Well, if you are not able to identify them, you tend to model everything as an "Entity".

It is hard to portray all the nuances of it in a post as short as this one. Let's try to make it clear in the class model (further). For now, keep in mind that "Value Objects" do not need to be treated as unique. Normally, they are immutable and their methods (when available) should provide a "Side-Effect-Free" behavior (not changing the object state).

Aggregate

Think of an Aggregate as a block composed of different pieces: even though these pieces might exist by themselves inside your context, it would not make any sense to use them separately. Let's see this again in the class model section.

Modeling

Well... it is time to start modeling and I have one important tip: when modeling, FORGET about data models (relational models) and do not think about persistence! It can really harm the way you see your modeling scenario. If you want to succeed in applying DDD, this mental approach is necessary.

These are just hypothetical models which I created in order to support our example. Please, do not expect fully functional models!

The "Product Catalog" context model

Based on everything that has been discussed above, this is my solution for the "Product Catalog" context:


Picture 2: The "Product Catalog" context model

Now, let's dig into the details of this simple class model. First of all, it is possible to notice that there is only one class modeled as "Entity", regardless of the fact this model is a small one. Why?

The Classes "Weight" and "Dimension" are very similar: both represent a value and how it should be interpreted. If you look at these classes' methods, you will notice that there are no "set" methods. An instance of "Weight", for instance, should have all of its attributes set when it is constructed. In order to achieve this, I might use a Constructor with all the parameters or a kind of "Creator" method.

Remember what was said about "Value Object" above? I don't need to change its data once it has been created. It works well being immutable and consequently allows me to be free of all the complexities related to "Entities".
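As a rough illustration, a "Weight" Value Object could be sketched in Python like this. The supported units and the "converted_to" method are my own assumptions; the model in the picture may define different attributes.

```python
from dataclasses import dataclass

# Assumed set of units, purely illustrative.
_GRAMS_PER_UNIT = {"g": 1.0, "kg": 1000.0}


@dataclass(frozen=True)
class Weight:
    """Hypothetical Value Object: a value plus how to interpret it."""
    value: float
    unit: str

    def __post_init__(self) -> None:
        # All attributes are checked once, at construction time.
        if self.value <= 0:
            raise ValueError("weight must be positive")
        if self.unit not in _GRAMS_PER_UNIT:
            raise ValueError("unknown unit: " + self.unit)

    def converted_to(self, unit: str) -> "Weight":
        # Side-effect free: returns a new instance, never changes this one.
        grams = self.value * _GRAMS_PER_UNIT[self.unit]
        return Weight(grams / _GRAMS_PER_UNIT[unit], unit)
```

Two instances with the same value and unit compare as equal, which is exactly the point: there is no identity to track, only the value.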

In the case of the class "Review", well... I believe most people would model it as an "Entity". But I think it fits well as a "Value Object". Its relationship with "Product" does not have a strong meaning inside the context. A "Product" does not depend on "Review" to be available in the catalog. Once an instance of "Review" is created, I do not see any reason to change it. So, I decided to model it as a "Value Object" too.

Finally, the "Product" class. It is an "Entity" for sure because I need it to be uniquely identifiable. Also, I must be able to change its state and might even want to track changes on it. It is an "Aggregate Root" too because it aggregates both "Shipping Weight" and "Dimension" in a composition relationship.

The real meaning of aggregation here is: you should see those objects as inseparable. For example: once an instance of the class "Weight" is created and set on an instance of "Product" as the "Shipping Weight", every time you load "Product" from your persistence layer, it should contain the "Shipping Weight" too. That is why you should be careful when modeling aggregates in order not to create monsters which will bring you problems (probably related to persistence complexity and performance).

Now the "Product" class methods. Can you explain why it is apparently missing some set methods? Why doesn't it have a "get" method to return the list / array of pictures? What the hell are those methods "activate()" and "deactivate()"?

In order to create an instance of "Product", you must supply at least the product ID and the inventory code (product identifier coming from another bounded context, the inventory context). This data is a kind of immutable data and it would not make any sense to allow a "Product" instance creation without it. Before you ask, I prefer not to use database features like identity columns, sequences and so on for ID generation. This way, my "Entity" classes do not depend on persistence in order to get a unique identifier.
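One common way to get identifiers without the database is a UUID generated in application code. The Python sketch below assumes that strategy; it is my choice for illustration, and the "register" factory is a hypothetical name.

```python
import uuid


class Product:
    """Hypothetical sketch; only identity-related data is shown."""

    def __init__(self, product_id: uuid.UUID, inventory_code: str) -> None:
        if product_id is None or not inventory_code:
            raise ValueError("product id and inventory code are required")
        self.id = product_id
        self.inventory_code = inventory_code

    @classmethod
    def register(cls, inventory_code: str) -> "Product":
        # The identity is generated in application code, before any save.
        return cls(uuid.uuid4(), inventory_code)
```

The entity is fully identifiable the moment it is created, so it can be tested and passed around without ever touching a data source.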

Moving on, I modeled "Product" having the following "Status" state machine:


Picture 3: The "Product" class "Status" state machine

This way, the system would support a product registration for future data input without it automatically appearing in the catalog.

Without exposing the list / array of pictures, the "Product" class has total control over any business logic regarding product pictures. Maximum number of pictures, position, duplicated pictures, etc.: all these kinds of rules can be managed and ensured by the expert class. So, I've got more cohesion and, in some cases, low coupling too.
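A short Python sketch of this kind of encapsulation follows. The limit of five pictures and the method names are hypothetical; they only illustrate how the rules stay inside the class.

```python
class Product:
    """Hypothetical sketch; only picture handling is shown."""

    MAX_PICTURES = 5  # assumed limit, purely illustrative

    def __init__(self) -> None:
        self._pictures = []  # never handed out to callers

    def add_picture(self, url: str) -> None:
        if url in self._pictures:
            raise ValueError("duplicated picture")
        if len(self._pictures) >= self.MAX_PICTURES:
            raise ValueError("maximum number of pictures reached")
        self._pictures.append(url)

    def move_picture(self, url: str, position: int) -> None:
        if url not in self._pictures:
            raise ValueError("unknown picture")
        if not 0 <= position < len(self._pictures):
            raise ValueError("invalid position")
        self._pictures.remove(url)
        self._pictures.insert(position, url)

    def picture_at(self, position: int) -> str:
        return self._pictures[position]

    @property
    def picture_count(self) -> int:
        return len(self._pictures)
```

If callers received the raw list, any of them could append a sixth picture or a duplicate and the class would never know.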

The method "activate()" is crucial: it must guarantee that, before changing the state of an instance to "Active", all required data have been provided and there is no violation of the status state machine flow. The "deactivate()" method should play a similar role in handling its pertinent business rules.
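The state machine part can be sketched in Python with a small transition table. The diagram is not reproduced here as text, so the "Inactive" state name and the allowed transitions below are assumptions of mine, and the required-data checks are left out to keep the focus on the flow.

```python
from enum import Enum


class Status(Enum):
    REGISTERED = "Registered"
    ACTIVE = "Active"
    INACTIVE = "Inactive"  # assumed name for the deactivated state


# Assumed transitions; the original diagram may differ.
_ALLOWED = {
    Status.REGISTERED: {Status.ACTIVE},
    Status.ACTIVE: {Status.INACTIVE},
    Status.INACTIVE: {Status.ACTIVE},
}


class Product:
    """Hypothetical sketch; only the status flow is shown."""

    def __init__(self) -> None:
        self.status = Status.REGISTERED

    def activate(self) -> None:
        self._move_to(Status.ACTIVE)

    def deactivate(self) -> None:
        self._move_to(Status.INACTIVE)

    def _move_to(self, target: Status) -> None:
        # Reject any transition the state machine does not allow.
        if target not in _ALLOWED[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {target.value}")
        self.status = target
```

Keeping the table in one place makes it easy to compare the code with the diagram when the flow changes.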

In this manner, applying all of these concepts, you avoid what many authors call an "Anemic Domain Model" (Fowler, Martin), since your classes provide real business functionality and not only data.

The "Orders" context model

All concepts previously discussed are valid for the "Orders" context too. So let's take a look at the model:


Picture 4: The "Orders" context model

The "Order" class is an "Entity" and "Aggregate" for the same reason explained above (in the "Product Catalog" model). It has methods to control all its pertinent business rules, like "cancel()", "close()", "calculateTotal()" and so on; it controls how new instances of "Item" are created, added and removed. Therefore, we can say that it is an expert and has high cohesion.

The "Product" class here represents a totally different concept. It basically represents an "Order Item". I modeled an interface "Item" just to add a little bit more of "low coupling" to the solution. Imagine that product data will come from another context (the "Product Catalog" context), but the "Order" application logic should be unaware of this.
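In Python, the same idea can be sketched with a Protocol playing the role of the "Item" interface. The attribute names ("description", "unit_price", "quantity") are assumptions for illustration; the model in the picture may use others.

```python
from typing import List, Protocol


class Item(Protocol):
    """What "Order" needs from an item; attribute names are assumptions."""
    description: str
    unit_price: float
    quantity: int


class Product:
    """The "Orders" context view of a product: basically an order item."""

    def __init__(self, description: str, unit_price: float, quantity: int) -> None:
        self.description = description
        self.unit_price = unit_price
        self.quantity = quantity


class Order:
    def __init__(self) -> None:
        self._items: List[Item] = []

    def add_item(self, item: Item) -> None:
        # "Order" only knows the Item interface, never the concrete class.
        if item.quantity <= 0:
            raise ValueError("quantity must be positive")
        self._items.append(item)

    def calculate_total(self) -> float:
        return sum(item.unit_price * item.quantity for item in self._items)
```

Anything that has these three attributes can be added to an order, so the source of the product data can change without touching "Order".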

In this manner, application classes interacting with the "Order" class don't even need to know about the existence of the "Product" class, even though it has a different meaning and its own context. I haven't talked about application logic classes before and I will explain them in the last part of this series.

Conclusion

I tried to be as succinct and clear as possible in order to share how I usually think when I model using DDD concepts. I hope these tips and insights might be useful for you guys. I would love to be in touch by email, comments, etc. with anyone who wants to discuss this subject. In the third and last part, I will show some code examples. Best regards!

Related posts:
Domain Driven Design: what is it really about? 
Domain Driven Design: a "hands on" example (part 1 of 3)

Friday, February 12, 2016

Domain Driven Design: a "hands on" example (part 1 of 3)

I have received feedback from some readers of my last post "Domain Driven Design: what is it really about?". Some of them mentioned that it is pretty difficult to get the hang of it, since DDD concepts seem to be very abstract. I must admit: it was not easy for me! Besides, I believe no one is the owner of the truth. I am still learning... and probably it will never end.

This way, my goal here is just to try helping others to understand some of the core concepts of DDD and how to apply it. Please, do not expect any kind of good practices handbook coming from this post! I will share with you guys part of my knowledge and experience on approaching software modeling problems using the DDD philosophy.

The problem

I will pick an example from an "e-Commerce" System. You will find a pretty similar one (but not as developed as the one presented here) in the book "Implementing Domain-Driven Design", by Vaughn Vernon (which I recommended in my last post). Vernon's book can be a little bit hard to follow and understand at first, so I recommend that you push yourself and read the first three chapters. There are a lot of concepts and things that seem to be weird and confusing, but might become clear at the end.

So, let's go to our hypothetical example. You have to build a whole new system in order to support the e-Commerce operation of the company where you work.  Your solution must include:
  • Product search, Catalog: users on your website must be able to search and see information about products.
  • Orders, Payment and Delivery: your customers should be able to place orders, pay for them and receive their goodies.
  • Inventory: your solution must provide inventory control for each product that is available to sell.
  • Authorization, Authentication: your solution must be able to identify and authorize valid users; the same for customers.
  • Customer management: your solution must be able to manage customer registries.

A non DDD approach to the problem

Given the information above, without applying DDD, we could approach the problem this way:

Picture 1: a possible solution (not a Domain Driven Design one) 

I have used that package notation but it is not important. It could be anything, such as drawing circles on a whiteboard. The important thing to notice is: this solution is intended to tackle the problem using a single Bounded Context.

This way, you will have to use all your knowledge of OOD (Object Oriented Design) in order to construct a Domain Class Model that supports all these business challenges. This task can be very, very hard! It would be necessary to represent many business requirements and behaviors in a single Domain Model. Believe me: it can be a pain in the ass even for the more experienced professionals.

I will not dig into architecture details about how this system / application could be built. The fact is: we have only one Domain Class Model and we can develop a single application for this. Have you noticed what Bounded Context means here? Having only one Bounded Context means: a single view of the problem = a single Domain Class Model to solve all those business challenges. We are going to talk more about Bounded Contexts further in this post.

Applying DDD

Let's start applying some DDD to this problem. We should start by identifying possible Subdomains inside our Domain. What the hell does it mean? Basically, a Domain is the business itself. It is what your company does and, consequently, it is the "problem" you want to solve. As we have to build an e-Commerce System, our Domain is the e-Commerce business. A Domain has its own strategic challenges which can be seen as Subdomains.

In other words, a Domain can be split into several Subdomains (a small specific view of part of the problem). Each of them can be classified as Core, Support or Generic. Once you have split your Domain into Subdomains, it is a good practice, and actually desirable, that you set a Bounded Context for each Subdomain. I have explained concepts like "Bounded Context" and "Ubiquitous Language" in the post which has led me to this one (see the link on top), but I will try to reinforce them now.

According to Vernon's book (see the link on top again), the following description would be a good definition for "Bounded Context":

"A set of specific software models, a specific solution expressing its own ubiquitous language.
It is a desirable goal to align Subdomains one-to-one with Bounded Contexts.
Does not necessarily encompass only the domain model. It often marks of a system, an application and / or a business service."

This way, my sense tells me that a good solution for the problem, applying these DDD concepts, would be:
Picture 2: DDD approach to the problem

As it is possible to observe, I took strategic business challenges which are pertinent to the Domain and figured out some Subdomains. For each of them, I set a Bounded Context and classified it accordingly (Core, Support or Generic). Usually, there is only one Core, which represents the main part of the Domain. In this case, I think that the "Product Catalog" Subdomain is the Core because it is what customers will be interacting with and, therefore, where the revenue will come from (customer shopping). The other Subdomains were classified as Support and Generic.

Support Subdomains are like auxiliary ones. In practice, you set a Bounded Context for each of them and create a specific application. It will work as a support application for the Core Domain application; however, support applications will have their own models.

Generic Subdomains are like support ones, but they have a strong particularity: they are such generic solutions that they could be used not only in the Domain they were created for, but by other Domains too. It is completely reasonable that your "Access Control Application", when well designed, can be reused to support Domains other than e-Commerce, for instance. That is why they are called Generic.

It is not difficult to realize that I have just organized my ideas about how I intend to approach the problem, and I have used DDD concepts in order to accomplish that. So, I ended up realizing that I would get a bunch of applications to develop. Perhaps, for some of these applications (like the Inventory one), an off-the-shelf software application would fit well. Also, it is clear that an integration between all these applications would be necessary.

If you imagine a real life scenario in a big e-Commerce company, it is possible that more than one team is allocated to support this operation. Possibly one team per Bounded Context.

DDD offers a way to classify and treat relationships between Bounded Contexts. If you have heard terms like "Partnership", "Shared Kernel", "Customer-Supplier", "Conformist", "Anti-corruption Layer", "Open Host Service", "Published Language", "Separate Ways" and "Big Ball of Mud" (my favorite one), you probably know what I am talking about. The details of these concepts are out of scope of this "hands on" example, but all of them can be found in the recommended book above (see the link on top).

Conclusion

We have just seen a problem description, a very simple and traditional way to approach it and a Domain Driven Design approach too. In this manner, we figured out that we will have several very specialized applications, and that some effort will be necessary in order to integrate them. In the next post, I will explore more technical examples regarding the Domain Class Model of some of these applications. For each Bounded Context, we should build a Domain Class Model. It is there that your OOD (Object Oriented Design) knowledge shines :)

Related Posts:
Domain Driven Design: what is it really about?
Domain Driven Design: a "hands on" example (part 2 of 3)

Saturday, January 30, 2016

Domain Driven Design: what is it really about?

According to the "Implementing Domain-Driven Design" book (by Vaughn Vernon and foreword by Eric Evans - one of the precursors of DDD), the core of Domain Driven Design philosophy is not about technical stuff.

What do you mean??? That was the first thought that came to my mind when I read it. Let's dig a little bit more into this subject in order to figure out what the above information really means.

So... What is DDD about?

DDD is about understanding how your domain really works. Identifying the core, support and generic Subdomains will make it possible to build well designed models. A Domain, in a broad sense, is the business you or your company are in. The emphasis of DDD is to identify Subdomains which together compose the whole Domain. Once you have done it, you can develop your models inside of Bounded Contexts.

The benefits

Let's think about a business Domain of an e-commerce website (a pretty regular one). Without a DDD approach, we might start thinking of a single Domain Model to support all the business requirements, which might include Product Catalog, Orders, Invoicing, Shipping, Inventory, etc.

If you have some hands on experience in modeling and programming a Domain Model for a system as described above, you know how difficult and extremely complex it can be to define and maintain a single class "Product" that supports all those business requirements! Here is where the DDD philosophy shines: you don't need to go this way! Following the DDD philosophy, we should identify Subdomains inside the Domain.

This way, Subdomains could be arranged so that each one of the business requirements mentioned above represents a Subdomain. For each identified Subdomain, you should set a Bounded Context. A Bounded Context is basically a context where your classes, entities, etc. have a unique meaning and conform to a Ubiquitous Language.

Again, being very simplistic, a Ubiquitous Language means that there must not be divergence about any concept inside a Bounded Context. This way, it is possible and completely reasonable to have, for instance, different classes (with different meanings) using the same name but in different Bounded Contexts.

A class representing the abstraction of "Product" can exist inside the "Product Catalog" Bounded Context, meaning an item available for searching and selling and having its own pertinent attributes and methods. At the same time, another class "Product" can exist inside the "Shipping" Bounded Context, meaning what must be delivered and having its own specific attributes and methods too.
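To make this concrete, here is a tiny Python sketch. The two outer classes only stand in for separate packages (one per Bounded Context), and every attribute and rule in them is an assumption invented for illustration.

```python
class ProductCatalog:
    """Stands in for a "product_catalog" package (one Bounded Context)."""

    class Product:
        # Here a product is something to search for and sell.
        def __init__(self, title: str, price: float) -> None:
            self.title = title
            self.price = price

        def is_sellable(self) -> bool:
            return self.price > 0


class Shipping:
    """Stands in for a "shipping" package (another Bounded Context)."""

    class Product:
        # Here a product is something that must be delivered.
        def __init__(self, weight_kg: float, destination: str) -> None:
            self.weight_kg = weight_kg
            self.destination = destination

        def needs_freight(self) -> bool:
            return self.weight_kg > 30  # assumed threshold
```

Both classes are called "Product", yet neither carries the attributes or rules of the other, and neither needs to.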

When you model thinking of a small and very specialized part of the problem, it is much easier to get a simple, reliable and robust model!

The trade off

Of course, "there's no free lunch" and the trade off law is always present. Having "N" Subdomains, Bounded Contexts and consequently different models adds complexity to the system as a whole. In order to have a functional system able to support a real e-commerce operation, it would be necessary to integrate your models and make all these parts work together.

For this, DDD offers a set of patterns and ways to treat and classify your Bounded Context relationships,  like "Partnership", "Shared Kernel", "Customer-Supplier", "Conformist", "Anti-corruption Layer" and others. All of these are concepts regarding how you see your problem and how you organize it. Therefore, the fact is: it is going to demand more attention, time, complexity and work regarding integration between Bounded Contexts.

DDD Concepts Misinterpretation

Unfortunately, there is still a lot of misinterpretation of these concepts. For example, I've seen people publishing tutorials on the internet referring to a layer used to decouple the application front-end layer from the application business layer as the "Shared Kernel". Well... according to the DDD concepts, a Bounded Context might even be an independent system. "Shared Kernel" is a technique to share Model Concepts (see Ubiquitous Language above) between Bounded Contexts, not between layers.

In the "Product" class example above, if it had the exact same meaning (same meaning, attributes, methods, etc.) for more than one Bounded Context, it could be treated as a "Shared Kernel". Domain Driven Design concepts and philosophy might be a little bit hard to understand at first. That has led people to misinterpret some concepts and consequently caused confusion.

Conclusion

I hope I have been able to convey and explain the very basic and real meaning of "DDD Core philosophy". As explained, it is much more about how to approach your problem instead of technical stuff. I strongly recommend the reading of the book "Implementing Domain Driven Design", by Vaughn Vernon, for those who want to master the Domain Driven Design concepts.



Wednesday, January 13, 2016

Parsing IIS logs with Logstash, sending data to ElasticSearch and analyzing in Kibana

Nowadays, for most modern web applications, it is very important to have information about requests coming to backend servers in order to know if your application is performing as expected. Being able to rely on historical data of your application's API (like the number of requests grouped by resource type) can be very useful. When you run on a cloud environment like AWS, you do not need to worry about that because the provider can supply all such information. However, if you are running on your own infrastructure, well... you have to find a solution by yourself. This is what has led me to the subject of this post.

Below, we can see a general deployment view in my case:
Image 1: general deployment view

The main idea is to get Logstash installed in each middle tier server parsing IIS logs and sending this parsed data to Elastic Search. Further, this data can be queried using Kibana in order to produce graphs and useful information about what happens with your application.

Now, we are going to see how I have achieved this. The purpose here is to share what I needed to do without going deep in details, therefore giving you the necessary background to accomplish this by yourself in your particular scenario.

IIS Logs: Installing  IIS Advanced Logging and configuring it

I have chosen to use Advanced Logging due to its extra fields and configuration capabilities. It can be installed using the IIS "Web Platform Installer" option (found in the IIS Services Manager panel interface, server selection) or it can be downloaded at:


Once you install it, don't forget to disable the default IIS logging. You don't want IIS generating two different log files. You can do this by clicking on "Logging" => "Disable" (on the right side of the IIS panel). In the same way, you should enable Advanced Logging by clicking on "Advanced Logging" => "Enable Advanced Logging". After that, you can tweak the configuration to get the log file format you want:

Image 2: IIS Advanced Logging configuration

Most default settings are fine. The main concerns are the order of the fields in the log file (which you can set using the options shown in the image above) and which fields should be part of the log (clicking the "Select Fields..." button brings up the following window):

Image 3: IIS Advanced Logging configuration - Select Logging Fields

Configuring this on more than one server can be "a pain in the ass". Fortunately, you can take what you have configured on one server and replicate it to the others. IIS 7.5 on Windows Server 2008 stores these settings in applicationHost.config, which can be found in "c:\windows\system32\inetsrv\config". Inside the "<system.webServer>" configuration section of this file, you can find the "<advancedLogging>" element. Copy it and use it to replace the same element in the other server's file. I have provided my configuration section in this .txt file. This way, if you want, you can use my Logstash configuration file too (further in this post) and save some time.

Logstash: Installing and configuring

Logstash is in charge of reading data from the IIS log files, filtering and parsing it, and sending it to Elasticsearch. Logstash can be downloaded from the Elastic company website. Details about how to install Logstash 1.5 (the version used in my solution) can be found here. After installing Logstash, you should create and configure your Logstash .conf file. It can be named as you want and is passed to the Logstash start command with the "-f" option (see more below). Basically, a Logstash configuration file has three sections:

input section

You can use this section to set the origin of Logstash incoming data. A typical configuration for getting data from IIS log files is like this:

input {
  file {
    type => "IIS Advanced Log"
    path => "C:/inetpub/logs/AdvancedLogs/*.log"
  }
}

filter section

This section is used to process events: parsing each log line into fields, transforming values and filtering out the data you do not want.
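
As a rough illustration only, a filter section for this kind of log could look like the sketch below. It is not the exact configuration used in this solution: the field order (date, time, method, URI stem, status, time taken) and the field names (log_timestamp, method, uri_stem, status, time_taken) are assumptions made for the example, as is the UTC timezone. The grok pattern has to be adapted to the fields and the order you selected in Advanced Logging.

```
filter {
  if [type] == "IIS Advanced Log" {
    # W3C-format logs typically begin with "#" header lines; skip them
    if [message] =~ "^#" {
      drop { }
    }

    # Assumed field order: date time cs-method cs-uri-stem sc-status time-taken
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{WORD:method} %{URIPATH:uri_stem} %{NUMBER:status:int} %{NUMBER:time_taken:int}" }
    }

    # Use the time written in the log line as the event timestamp
    # (assumes the log is written in UTC)
    date {
      match => [ "log_timestamp", "yyyy-MM-dd HH:mm:ss" ]
      timezone => "UTC"
    }
  }
}
```

Lines that do not match the pattern are tagged with "_grokparsefailure" by grok, which is a convenient thing to search for while you adjust the pattern.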

output section

You can use this section to tell Logstash what to do with the processed data, such as sending it to an Elasticsearch server.
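
A minimal sketch of an output section, assuming Logstash 1.5.x and an Elasticsearch server reachable over HTTP, could be the following. The host name and the index name are hypothetical placeholders. Option names differ between Logstash versions (newer releases use "hosts" instead of "host" and "protocol"), so check the documentation for the version you run.

```
output {
  elasticsearch {
    # Hypothetical host name; replace it with your Elasticsearch server
    host => "my-elasticsearch-server"
    protocol => "http"
    # One index per day, named after the event date (iis-YYYY.MM.dd)
    index => "iis-%{+YYYY.MM.dd}"
  }

  # Optional: also print every event, handy while testing the configuration
  stdout { codec => rubydebug }
}
```

Writing to a daily index makes it easy to delete old data later by simply removing whole indexes.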

There is a lot of information on the Elastic website about how to configure and take advantage of all the features each of these sections provides. As I said before, I have shared my .conf file. If you use it together with the given configuration for IIS Advanced Logging (as described above), you should be able to get this working just as I did.

To keep Logstash running as a Windows service, I used NSSM - the Non-Sucking Service Manager. It has worked properly. Below is an example of a script that installs Logstash as a Windows service:

@echo off
set nssm_path=c:\nssm\nssm-2.24\win64
set logstash_path=c:\Elastic\logstash-1.5.2
echo.
echo Installing Logstash as a service...
echo.
echo Expected Logstash path: %logstash_path%
echo Expected nssm path: %nssm_path%
echo.
echo.
REM /d also switches the drive when the script is started from another drive
cd /d %nssm_path%
nssm install logstash %logstash_path%\bin\logstash.bat
nssm set logstash AppParameters -f %logstash_path%\bin\[your logstash config file] -l %logstash_path%\log\logs.log
nssm set logstash AppDirectory %logstash_path%
nssm set logstash AppEnvironmentExtra "JAVA_HOME=%JAVA_HOME%"
nssm set logstash AppStdout %logstash_path%\nssm\stdout.log
nssm set logstash AppStderr %logstash_path%\nssm\stderr.log
REM Replace stdout and stderr files
nssm set logstash AppStdoutCreationDisposition 2
nssm set logstash AppStderrCreationDisposition 2
REM Disable WM_CLOSE, WM_QUIT in the Shutdown options. Without it, NSSM can't stop Logstash properly
nssm set logstash AppStopMethodSkip 6
REM Let's start Logstash. I assume a correct configuration is already in place
REM net start logstash

After that, a new Windows service named "logstash" will be available on your server. Type "net start logstash" in a command prompt window running with administrator privileges and the service should start:

Image 4: starting logstash service after setting it using NSSM.

Kibana: the final step to get done!

Installing Kibana is easy: just follow the instructions available on the Elastic website. Once it is installed, it is reasonably easy to get your reports. As I said before, the goal here is not to teach you how to use Kibana or even Logstash. You can take a look at the tutorials and documentation on the Elastic website (Kibana User Guide). Below is just an example of what Kibana can do for you:


Image 5: Data visualization on Kibana


Conclusion

Using Logstash, Kibana and NSSM, it is possible to have a service analyzing your IIS log data, processing it and building some cool graphs :)