Be careful what you wish for…
Auto-detecting new errors, exceptions, and bugs in log4j logs
XpoLog’s Analytic Search has been around for quite a while, but our latest version brings many new gadgets with it, so you had better be careful what you ask for. XpoLog’s unique search engine finds not only what you are looking for, but also all the things you never thought to look for in the first place, and serves them to you with a complete analysis.
In this series of posts I am covering some of the ways you can benefit from XpoLog 6’s new features and enhancements, and especially how to get the most valuable information from your log4j event logs. By running Analytic Search on your log4j data, you can measure your application performance and thread activity, create your own Apps for better monitoring, measure code activity with class and method analytics on log4j, build security analysis, and create dashboards, charts, slide-shows, and other visualization gadgets for maximum analysis.
In this post I will show you the basics of XpoLog Analytics: how to auto-detect new errors, exceptions, and bugs in log4j logs, and how to discover unknown messages. To read our whole hands-on guide, see “Getting the maximum from your log4j logs”. If you want to follow along as you read, you can download our software for free here.
From Search to Analytics
In my previous post about AppTags, I discussed Simple and Complex Search using XpoLog’s powerful search engine. Already at this stage, XpoLog suggests analytic insights you may be interested in investigating further.
To put it simply, while XpoLog’s Search Engine gives you everything you asked for, XpoLog Analytics gives you everything else.
If you are searching for a string, a thread, an error, etc. in any number of logs, folders, applications, or servers within a given time frame, XpoLog’s search engine will find and display every event that matches the search request across all of those logs. But XpoLog also opens the door to any other abnormality that occurred in these logs within the same time frame, and this brings us to Analytics. In other words, as soon as you conduct a search, XpoLog automatically presents the other issues you did not even know existed, be they errors, exceptions, unknown messages, or any other anomaly.
As an example, look at the simple search we did (in my previous post) looking for all log4j logs where the priority was ERROR. Quick recap: Inside XpoLog Center, on the Search page, in the search field, we typed:
priority=error in log.log4j server*
The result looked like this:
Below the graph, XpoLog displays all the events where the priority=ERROR. But in the side bar (see red rectangle), XpoLog has already suggested Analytic Insights, such as ERROR, java.io.EOFException, … and the list goes on. These Insights may not necessarily appear in any of the events where priority=ERROR, but they do appear somewhere in these logs, and hence may indicate that something went wrong somewhere.
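To make the idea concrete, here is a minimal Python sketch of how insights of this kind could be surfaced. It is not XpoLog’s actual algorithm, which this post does not describe. The sample log lines, the regular expression, and the function name suggest_insights are all assumptions made for illustration. The sketch collects Java exception class names from every line in the searched logs, not only from the lines that matched the query.

```python
import re
from collections import Counter

# Hypothetical pattern: fully qualified Java exception/error class names,
# e.g. java.io.EOFException or java.lang.OutOfMemoryError.
EXCEPTION_RE = re.compile(
    r"\b(?:[a-z_][a-z0-9_]*\.)+[A-Z]\w*(?:Exception|Error)\b"
)

def suggest_insights(lines):
    """Count exception class names seen anywhere in the given log lines."""
    counts = Counter()
    for line in lines:
        for name in EXCEPTION_RE.findall(line):
            counts[name] += 1
    return counts

# Made-up log4j-style lines, for illustration only.
sample = [
    "2015-01-01 10:00:00 ERROR [main] com.example.App - Request failed",
    "2015-01-01 10:00:01 WARN  [worker-1] com.example.Net - java.io.EOFException while reading",
    "2015-01-01 10:00:02 INFO  [worker-2] com.example.Net - retry after java.io.EOFException",
]

# What a search for ERROR events would return.
matched = [line for line in sample if " ERROR " in line]

# What the searched logs contain as a whole.
insights = suggest_insights(sample)
```

In this toy data the exception never appears in an ERROR event, yet it still shows up in the insight counts, which is the behavior described above.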
So even at this stage, when all you are doing is searching for something, XpoLog is several steps ahead, analyzing and inviting you to dig deeper to find the root of the issue. From the Analytics Insight list, we can select one or more insights and either add them to the search or use them to replace the search. We can then investigate the matter further in XpoLog Analytics.
Inside XpoLog Analytics
The screen capture below shows the Analytics page. The top section has a graph showing you the data distribution and the maximum severity of the events over the selected time-span. Below the graph is a table showing the logs and folders in which these events were found. Below this section is another table showing the 10 most severe errors that were found in any of these logs and folders.
When listing the logs, XpoLog lists those containing high severity problems first (red), then those with medium severity problems (orange), and lastly those containing low severity problems (green). XpoLog sets the severity level of an event according to the most severe anomaly found in it. You may be searching for an anomaly with medium severity, but if another anomaly with high severity is found in the same event, the event as a whole is marked as high severity.
In the screen capture above, a search was conducted for “Failed to initialize hudson”, which has medium severity. Within the same event, XpoLog found a hidden message, java.lang.OutOfMemoryError, which has high severity, thus raising the entire event to high severity.
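The roll-up rule and the ordering of logs can be sketched in a few lines of Python. This is an illustration of the rule as described here, not XpoLog’s implementation. The numeric ranking, the function names, and the log file names are assumptions.

```python
# Assumed ranking of the three levels named in this post.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}

def event_severity(anomaly_severities):
    """An event takes the highest severity among the anomalies found in it."""
    return max(anomaly_severities, key=SEVERITY_RANK.__getitem__)

def sort_logs(log_severity):
    """Order log names so that the most severe come first."""
    return sorted(
        log_severity,
        key=lambda name: SEVERITY_RANK[log_severity[name]],
        reverse=True,
    )

# The hudson example: a medium anomaly and a high anomaly in the same event.
hudson_event = {
    "Failed to initialize hudson": "medium",
    "java.lang.OutOfMemoryError": "high",
}
overall = event_severity(hudson_event.values())

# Hypothetical log names with their highest severity.
logs = {"access.log": "low", "hudson.log": "high", "catalina.log": "medium"}
ordered = sort_logs(logs)
```

Here overall evaluates to "high", and ordered lists the high severity log first, then the medium one, then the low one.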
From the initial Analytics page, which by default shows the total summary of all anomalies in all logs of your search, you can drill down for more specific details. For the sake of our example, let’s drill down into log4j:
The Analytics page has now drilled down to the log4j level (see screen capture below). You can see the number of anomalies has been reduced, as has the amount of data being depicted. The first table below the graph now contains only folders of the log4j applications and the second table shows the most severe log4j problems found.
Log4j Use Cases
Let’s have a look at a potential use case. Inside log4j is a tomcat folder.
A user is complaining that tomcat will not start. We don’t know what the problem is, so the easiest way to find out is to do a general search for anything abnormal going on in tomcat in the given time frame when the user was unable to start it.
Inside XpoLog Search, in the search field, we type the following query:
* IN folder tomcat 8
In addition to the requested search results, XpoLog suggests many more analytic insights. There could be many reasons why tomcat did not start, so from the Analytics Insight list, we select java.net.BindException and replace the * search with this query:
"java.net.BindException" IN folder tomcat 8
The screen capture below shows how to replace the existing search query: right-click on the insight and then select Replace search:
The search result for “java.net.BindException” IN folder tomcat 8 will look as follows:
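We now have completely new events in the list, and the events for the insight “java.net.BindException” show the message “address is already in use”. This error typically means that another process is already bound to the address and port tomcat is trying to use, which is why the user was unable to start tomcat.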
In another example, we created an App called hudson. Hudson worked for a while, but at some point it stopped. We did not know why, so we did a search for hudson. In the search field, we typed:
hudson IN folder tomcat 8
We got the following results:
Here we can see in the Analytics Insight list that there is a high severity error: java.lang.OutOfMemoryError. In this particular example, the error occurs in the first event on our list of events containing “hudson”. We now know why hudson stopped working. By zooming in and hovering over the graph, you also get a breakdown of the high, medium, and low severity errors at each point in time.
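A per-moment breakdown like the one the graph shows can be approximated by counting events per time bucket and severity. The following Python sketch assumes one-minute buckets and uses invented timestamps. How XpoLog actually buckets the graph is not stated in this post.

```python
from collections import Counter
from datetime import datetime

def bucket_by_minute(events):
    """Count events per (minute, severity) pair."""
    counts = Counter()
    for timestamp, severity in events:
        # Truncate the timestamp to the start of its minute.
        minute = timestamp.replace(second=0, microsecond=0)
        counts[(minute, severity)] += 1
    return counts

# Invented events, for illustration only.
events = [
    (datetime(2015, 1, 1, 10, 0, 5), "high"),
    (datetime(2015, 1, 1, 10, 0, 40), "low"),
    (datetime(2015, 1, 1, 10, 0, 59), "high"),
    (datetime(2015, 1, 1, 10, 1, 10), "medium"),
]
per_minute = bucket_by_minute(events)
```

Each key of per_minute is a (minute, severity) pair, so a chart can stack the high, medium, and low counts for every minute in the selected time frame.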
These examples and use cases show that while users are simply searching for known anomalies, or for any anomaly without prior knowledge, XpoLog is already several steps ahead, analyzing everything else in the searched logs that the user never thought of. In the fraction of a second it takes XpoLog to run the requested search, a complete list of analytic insights is also created, inviting users to dig deeper into their logs, folders, apps, or servers to get to the root of whatever is causing them trouble.
So by now you should have figured out why we call our searches “Analytic Search”… ☺
In my next post I will take XpoLog’s Analytic Search a step further and show you how to check your application’s performance and availability. Stay tuned, or go directly to “Getting the Maximum from your log4j logs”.