What is Log Management
Understanding this IT element is essential to nearly all IT businesses and operations.
Logs are continuous digital records of events generated by all components of your software stack.
In other words, logs are everywhere. Every supporting service or component of a modern cloud application logs every action or event that takes place within it. An example of log data from a client machine is shown below. Each line includes a timestamp followed by the hostname, process type, log type, application, action, and TCP socket status.
Dec 10 12:58:39 Client-mac socketfilterfw[183] : openvpn-service: Allow TCP LISTEN (in:0 out:2)
Dec 10 13:01:09 Client-mac socketfilterfw[183] : Dropbox: Allow TCP LISTEN (in:0 out:2)
Dec 11 18:40:03 Client-mac socketfilterfw[183] : Office365Service: Allow TCP CONNECT (in:1 out:0)
Dec 12 17:17:25 Client-mac socketfilterfw[183] : Dropbox: Allow TCP LISTEN (in:0 out:4)
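As a rough illustration, the fields in the sample lines above could be extracted with a regular expression. This is a minimal sketch: the pattern is written only against the four lines shown, the field names (`in_count`, `out_count`, and so on) are illustrative assumptions, and because no separate log-type token is visible between the process and the colon in these samples, that field is not extracted.

```python
import re
from typing import Optional

# Pattern written against the sample lines only; field names are illustrative.
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "   # e.g. "Dec 10 12:58:39"
    r"(?P<hostname>\S+) "                                   # e.g. "Client-mac"
    r"(?P<process>[\w.-]+)\[(?P<pid>\d+)\]\s*:\s*"          # e.g. "socketfilterfw[183] :"
    r"(?P<application>[^:]+): "                             # e.g. "Dropbox"
    r"(?P<action>\w+) TCP (?P<socket_status>\w+) "          # e.g. "Allow TCP LISTEN"
    r"\(in:(?P<in_count>\d+) out:(?P<out_count>\d+)\)$"     # e.g. "(in:0 out:2)"
)


def parse_line(line: str) -> Optional[dict]:
    """Return the named fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None
```

Returning `None` for lines that do not match lets a caller count or set aside unparsed lines instead of failing on the first unexpected format.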
Logs are like an x-ray report of the body: they provide visibility into the health of the application and infrastructure stack. Without logs, the internal operations of most IT components would be inscrutable. This lack of visibility creates further operational challenges when modern applications rely on the cloud, on infrastructure they don’t own, and on a microservices architecture in which the three-tier architecture becomes an n-tier architecture with many-to-many communication between services.
Without that visibility, in the event of a security incident or operational outage, your DevOps, ITOps, and SecOps teams lack the insight they need to resolve the issue quickly. This lack of visibility into the stack often leads to higher application latency and more system outages, which translate into poor customer experience and customer churn.
Knowing where to look to pinpoint the problems behind customer satisfaction issues, application slowdowns, system-wide outages, or security threats is the primary reason logs exist. Each individual log contains a stream of events and includes a wealth of data about software and related infrastructure performance, availability, user access, and behavior. Happily, in today’s tech-driven world, logs are pretty much ubiquitous. By analyzing these logs, one can proactively detect and resolve issues that impact the business.
Some examples of such log events are:
Install a collector to collect data from any part of your stack. One can collect logs from operating systems, containers, network devices, AWS infrastructure, application access logs, and custom events. Collection can be done using Syslog, or applications can write logs directly into the centralized log management system over HTTP. Schema on read saves a lot of time because it eliminates the need to pre-parse a log before ingesting it into a log management system.
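To make the HTTP collection path concrete, here is a minimal sketch of an application posting raw, unparsed log lines to a collector endpoint. The URL, the function names `build_upload_request` and `send`, and the newline-delimited plain-text body are all assumptions made for illustration; a real service defines its own endpoint and accepted formats.

```python
import urllib.request

# Hypothetical endpoint; substitute the URL issued by your log management service.
COLLECTOR_URL = "https://collector.example.com/receiver/v1/http/TOKEN"


def build_upload_request(lines, url=COLLECTOR_URL):
    """Build (but do not send) a POST carrying raw, newline-delimited log lines.

    The lines are shipped as-is, in the spirit of schema on read:
    nothing is parsed before ingestion.
    """
    body = "\n".join(lines).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def send(request, timeout=10.0):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Building the request separately from sending it keeps the payload easy to inspect without touching the network.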
Centralize the logs for easy access and visibility into relevant modules. With centralized logs, users never have to hop from one server to another and manually “grep” logs of interest on multiple systems, e.g. to search for a particular string or pattern of text within a log. Indexing allows ITOps and SecOps to quickly search for any term within a log, similar to a Google search.
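The idea behind indexing can be sketched with a small in-memory inverted index: each term maps to the set of log lines that contain it, so a search becomes a set lookup rather than a scan of every line. This is only an illustration of the concept, not how any particular product implements it; the tokenizer and the function names are assumptions.

```python
import re
from collections import defaultdict

# Simple tokenizer: runs of letters, digits, underscores, and hyphens.
TOKEN_RE = re.compile(r"[A-Za-z0-9_-]+")


def build_index(lines):
    """Map each lowercased token to the set of line positions containing it."""
    index = defaultdict(set)
    for position, line in enumerate(lines):
        for token in TOKEN_RE.findall(line.lower()):
            index[token].add(position)
    return index


def search(index, lines, *terms):
    """Return the lines that contain every given term (AND semantics)."""
    if not terms:
        return []
    hits = set.intersection(*(index.get(term.lower(), set()) for term in terms))
    return [lines[position] for position in sorted(hits)]
```

For example, after `index = build_index(lines)`, calling `search(index, lines, "dropbox", "listen")` returns only the lines that mention both terms, without re-reading the rest.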
After indexing is done, ITOps can search and analyze the information and apply a schema on read. Analysis can be done manually, or one can use native machine learning for advanced analytics to identify and compare patterns or spot outliers.
Monitoring and alerting is the next phase. Log management should integrate with commonly used collaboration software such as HipChat, Slack, and PagerDuty to alert users. Continuous monitoring of large volumes of data and logs is inevitable, but alerting users in time requires dynamic thresholds and advanced analytics powered by machine learning.
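A dynamic threshold can be approximated with a simple statistical baseline: instead of a fixed limit, the alert level is derived from recent history. The sketch below uses the mean plus `k` standard deviations, which is a plain statistical stand-in and not the machine-learning analytics mentioned above; the function names and the default `k=3.0` are assumptions, and delivering the alert to a chat or paging tool is left out.

```python
from statistics import mean, stdev


def dynamic_threshold(history, k=3.0):
    """Return mean + k * sample standard deviation of recent event counts."""
    if len(history) < 2:
        raise ValueError("need at least two historical values")
    return mean(history) + k * stdev(history)


def should_alert(history, current, k=3.0):
    """True when the current count exceeds the threshold derived from history."""
    return current > dynamic_threshold(history, k)
```

Because the threshold is recomputed from whatever window of history is passed in, it rises and falls with normal load instead of firing on every busy period.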
It is important to share reports and dashboards so that the entire team has access to the same data. An added benefit is that you can create these reports and dashboards just once and then reuse them many times, without requiring other users to recreate them. Relatedly, RBAC (Role-Based Access Control) is mandatory for providing need-to-know access to the team.
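The need-to-know principle behind RBAC can be sketched as a mapping from roles to permissions, checked before a dashboard or search is served. The role names and permission strings below are hypothetical and exist only for illustration.

```python
# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "viewer": {"dashboard:view"},
    "analyst": {"dashboard:view", "search:run"},
    "admin": {"dashboard:view", "dashboard:edit", "search:run"},
}


def is_allowed(roles, permission):
    """True if any of the user's roles grants the requested permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

Unknown roles grant nothing, so access is denied by default.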
If logs are so valuable, why can’t we just grep them to find what we are looking for? It turns out, that’s not quite so simple, for a few reasons.
Ideally, you need a centralized log management solution that collects all of these logs, correlates them, and analyzes them to provide meaningful insights that help IT solve SLA, performance, and availability problems.
Sumo example of a search query:

    _sourceCategory=config Notification ConfigurationItemChangeNotification
    | json "Message", "Type" as single_message, type
    | where type == "Notification"
    | json field=single_message "configurationItem" as single_message
    | json field=single_message "resourceType", "configurationItemStatus", "awsRegion"
    | where configurationItemStatus = "ResourceDeleted"
    | count by resourceType

In the above example, the query counts the total number of deleted resources, by resource type, as reported by AWS Config.
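For readers less familiar with the query language, the same filter-and-count logic can be sketched in Python. The sketch assumes each raw event is a JSON document with top-level `Type` and `Message` fields, and that `Message` is either a JSON-encoded string or an already-decoded object; the function name is hypothetical, and the keyword check is only a rough stand-in for the query's keyword search.

```python
import json
from collections import Counter


def count_deleted_resources(raw_events):
    """Count deleted resources by resourceType, mirroring the query steps."""
    counts = Counter()
    for raw in raw_events:
        # Rough equivalent of the keyword search in the first query line.
        if "ConfigurationItemChangeNotification" not in raw:
            continue
        envelope = json.loads(raw)
        if envelope.get("Type") != "Notification":
            continue
        message = envelope.get("Message", {})
        if isinstance(message, str):  # assumed: Message may be JSON-encoded text
            message = json.loads(message)
        item = message.get("configurationItem", {})
        if item.get("configurationItemStatus") == "ResourceDeleted":
            counts[item.get("resourceType")] += 1
    return counts
```

The returned `Counter` plays the role of `count by resourceType`: one entry per resource type, with the number of deletion events seen.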
In order to draw meaningful insights from these logs that are everywhere and ever-growing, you need a scalable platform that centralizes all these logs, provides a simple search interface for users to look for common exceptions, applies machine learning to detect patterns in behaviors, and helps users with insightful information to not only reactively fix the issues but also to prevent them from recurring.
Some common use cases of log management solutions for developers, IT operators, and security professionals include:
Sumo Logic is a cloud-native, secure, centralized log analytics service that provides insights into logs through pre-built applications and identifies patterns to show outliers in the behavior of applications and systems. IT teams can then instantly act on these outliers, get to the root cause, and prevent any future impact to the business.
Sumo Logic can collect logs from almost any system in nearly any format, and our centralized log management service analyzes over 100 PB of data on an average day!
Sumo Logic provides everything you need to conduct real-time forensics and log management for all of your IT data without performing complex installations or upgrades, and without the need to manage and scale any hardware or storage. With fully elastic scalability, Sumo Logic is a fit for any size deployment.
The following table lists data types and some of the more common sources that produce logs and which can be collected by Sumo Logic. This list is a sample only to provide a general idea of the possible sources of log data; it is by no means complete. For more information on how Sumo Logic can help you manage log data, please visit the Sumo Logic application page.
Technology | Popular Log Sources |
---|---|
Open Source | |
Middleware | |
Databases | |
Server / OS | |
Virtual | |
Network | |
Content Delivery | |
IaaS / PaaS | |
SaaS | |
Security | |
Integration with Custom library | |
AWS Services | |
Modern Technology | |
“Before Sumo Logic, we thought that real-time monitoring coupled with the ability to unify log and metric data in a scalable SaaS-based platform was an unattainable goal. Not only does it exist, but Sumo Logic is setting the gold standard.”