Support Questions

Find answers, ask questions, and share your expertise

Apache - Solr : Need help with group by on solr documents and sum on different field values

avatar
Cloudera Employee

Hello guys,

Can someone please help me, how I can write a solr query where results to be grouped on one field (text type) and sum on another field. Here is my documents results:

 

 

{
"responseHeader": {
"zkConnected": true,
"status": 0,
"QTime": 30,
"params": {
"q": "*:*",
"doAs": "knoxui",
"fl": "access, agent, repo, resource, resType, event_count",
"fq": [
"access:read",
"repo:cm_ozone",
"resType:key",
"action:read"
],
"_forwardedCount": "1",
"_": "1674633786010"
}
},
"response": {
"numFound": 8,
"start": 0,
"docs": [
{
"access": "read",
"agent": "ozone",
"repo": "cm_ozone",
"resource": "volume1/fso-bucket/dir1/dir2/dir3",
"resType": "key",
"event_count": 1
},
{
"access": "read",
"agent": "ozone",
"repo": "cm_ozone",
"resource": "volume1/fso-bucket/dir1",
"resType": "key",
"event_count": 1
},
{
"access": "read",
"agent": "ozone",
"repo": "cm_ozone",
"resource": "volume1/fso-bucket/dir1/test3.txt",
"resType": "key",
"event_count": 1
},
{
"access": "read",
"agent": "ozone",
"repo": "cm_ozone",
"resource": "volume1/fso-bucket/dir1/dir2",
"resType": "key",
"event_count": 1
},
{
"access": "read",
"agent": "ozone",
"repo": "cm_ozone",
"resource": "volume1/fso-bucket/dir1/dir2/dir3",
"resType": "key",
"event_count": 1
},
{
"access": "read",
"agent": "ozone",
"repo": "cm_ozone",
"resource": "volume1/fso-bucket/dir1",
"resType": "key",
"event_count": 1
},
{
"access": "read",
"agent": "ozone",
"repo": "cm_ozone",
"resource": "volume1/fso-bucket/dir1/test3.txt",
"resType": "key",
"event_count": 1
},
{
"access": "read",
"agent": "ozone",
"repo": "cm_ozone",
"resource": "volume1/fso-bucket/dir1/dir2",
"resType": "key",
"event_count": 1
}
]
}
}

 

Here I need something like equivalent to SQL:

 

select resource, count(event_count) from <docs> group by resource;

1 ACCEPTED SOLUTION

avatar
Cloudera Employee

Ok guys, I figured this out myself:

 

q=*:*&wt=json&fl=access%2C%20agent%2C%20repo%2C%20resource%2C%20resType%2C%20event_count&fq=access%3Aread&fq=repo%3Acm_ozone&fq=-repoType%3A7&fq=resType%3Akey&fq=evtTime%3A%5B2023-01-23T18%3A30%3A00Z+TO+NOW%5D&
json.facet={
resources:{
type : terms,
field : resource,
facet:{
read_access_count : "sum(event_count)"
}
}
}'

View solution in original post

1 REPLY 1

avatar
Cloudera Employee

Ok guys, I figured this out myself:

 

q=*:*&wt=json&fl=access%2C%20agent%2C%20repo%2C%20resource%2C%20resType%2C%20event_count&fq=access%3Aread&fq=repo%3Acm_ozone&fq=-repoType%3A7&fq=resType%3Akey&fq=evtTime%3A%5B2023-01-23T18%3A30%3A00Z+TO+NOW%5D&
json.facet={
resources:{
type : terms,
field : resource,
facet:{
read_access_count : "sum(event_count)"
}
}
}'