Group or Aggregate Operators Overview
Aggregating (group-by) functions evaluate messages and place them into groups. The group
operator is used in conjunction with group-by functions.
Only search results that have been aggregated using a group or aggregate operator can be placed on a dashboard panel. See Chart Search Results for information about charting.
Overview
When using any grouping function, the word by is sufficient for representing the group operator. The typical construction when using group-by functions is:
grouping_function by <fieldname>
The withtime
, most_recent
, and least_recent
operators are not considered standalone operators; they are designed to only be used as an alternative to the first
and last
operators in auto refresh dashboards or any continuous query where first and last are not supported.
By default, the ordering is not defined inside of groups created using a group-by expression. To order your results, use the sort
operator.
Syntax
... | group_by_function <field_to_operate_on> group by <field_to_group_by>[, <field2>, ...]
You can use by instead of group by so count group by user
is equivalent to count by user
.
Rules
- Cannot be used with the LogReduce operator.
- When parsing and naming (aliasing) fields, avoid using the names of grouping functions or other operators as field names.
- When using count, or any grouping function, remember to include the underscore before the field name (sort by
_count
). - Multiple aggregation functions can be on the same line, but you cannot include another function, such as a math function, on the same line of a query.
For example, you cannot use:
... | avg(x + y) as average, sum(x+y) as total
You would need to do that in two separate steps, such as:
... | x + y as z | avg(z) as average, sum(z) as total
In another example, you cannot use:
avg(abs_latency)/1000/60 as avg_latency_min
Instead, you'd need to use two separate lines:
avg(abs_latency_ms) as avg_latency_ms
| avg_latency_ms / 1000 / 60 as avg_latency_min
Examples
* | parse "GET * " as url
| count by url
| sort by _count
| limit 10
status AND down
| parse regex "user=(?<user>.*?)"
| parse regex "host=(?<msg_host>.*?)"
| count by user
_sourceCategory=apache
| parse "* " as src_ip
| parse "GET *" as url
| count by src_ip
| sort by _count
| count(field1), avg(field2) group by field1, _timeslice
| max(amount) as amount_max, count(datetime) as datetime_count, sum(_size) as messages_size_sum, last(query) as last_query
All Sumo Logic system-generated fields begin with an underscore (_
). Group-by functions always create a Sumo Logic field named with a combination of an underscore (_
) and the function name. Using the function count
inserts a field into the pipeline called _count
. The function count_distinct
inserts a field into the pipeline called _count_distinct
.