Pinot offers several ways to troubleshoot and debug problems. Start with the debug API, which can quickly surface some of the most common problems. It provides information such as tableSize, ingestion status, and any error messages related to state transitions on the servers, among other things.
The table debug API can be invoked via the Swagger UI.
It can also be invoked directly by accessing the URL as follows. The API requires the tableName, and can optionally take a tableType (offline|realtime) and a verbosity level.
curl -X GET "http://localhost:9000/debug/tables/airlineStats?verbosity=0" -H "accept: application/json"
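The same request can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names `build_debug_url` and `fetch_debug_info` are hypothetical, and the `type` query parameter used for the table type is an assumption (the text above only names the option as tableType), so verify it against the Swagger UI of your controller.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def build_debug_url(controller, table_name, table_type=None, verbosity=None):
    """Build the table debug URL for a controller such as http://localhost:9000."""
    params = {}
    if table_type is not None:
        # Assumption: the optional table type is passed as "type".
        params["type"] = table_type
    if verbosity is not None:
        params["verbosity"] = verbosity
    url = f"{controller.rstrip('/')}/debug/tables/{quote(table_name, safe='')}"
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_debug_info(url, timeout_s=10):
    """Issue the GET request and return the parsed JSON body."""
    request = Request(url, headers={"accept": "application/json"})
    with urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

For the example above, `build_debug_url("http://localhost:9000", "airlineStats", verbosity=0)` produces the same URL as the curl command, and passing the result to `fetch_debug_info` requires a reachable controller.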
Pinot also provides a wide variety of operational metrics that can be used for creating dashboards, alerting, and monitoring. In addition, all Pinot components log debug information related to error conditions, which can be used for troubleshooting.
Please use these steps:
- If the query executes, look at the query result. Specifically, look at numEntriesScannedInFilter and numDocsScanned.
  - If numEntriesScannedInFilter is very high, consider adding indexes for the corresponding columns being used in the filter predicates. You should also think about partitioning the incoming data based on the dimension most heavily used in your filter queries.
  - If numDocsScanned is very high, that means the selectivity for the query is low and lots of documents need to be processed after the filtering. Consider refining the filter to increase the selectivity of the query.
- If the query is not executing, you can extend the query timeout by appending a timeoutMs parameter to the query (e.g., select * from mytable limit 10 option(timeoutMs=60000)). Then you can repeat step 1.
- You can also look at GC stats for the corresponding Pinot servers. If a particular server seems to be running full GC all the time, you can do a couple of things, such as:
  - Increase JVM heap (Xmx)
  - Consider using off-heap memory for segments
  - Decrease the total number of segments per server (by partitioning the data in a better way)
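The first two steps above can be sketched as a small helper that inspects the statistics in a query result and appends the timeout option. This is an illustrative sketch only: the function names `diagnose` and `with_timeout` are hypothetical, and the two thresholds are assumptions, since the steps only say "very high" and the right cutoffs depend on your tables.

```python
# Hypothetical cutoffs for "very high"; tune these for your own workload.
FILTER_ENTRIES_THRESHOLD = 10_000_000
DOCS_SCANNED_THRESHOLD = 1_000_000


def diagnose(result):
    """Return tuning hints for a query result given as a dict of statistics."""
    hints = []
    if result.get("numEntriesScannedInFilter", 0) > FILTER_ENTRIES_THRESHOLD:
        hints.append(
            "numEntriesScannedInFilter is high: consider indexes on the filter "
            "columns, or partition data on the most-used filter dimension"
        )
    if result.get("numDocsScanned", 0) > DOCS_SCANNED_THRESHOLD:
        hints.append(
            "numDocsScanned is high: query selectivity is low, consider "
            "refining the filter"
        )
    return hints


def with_timeout(query, timeout_ms):
    """Append the timeoutMs query option in the form shown in the steps above."""
    return f"{query.rstrip()} option(timeoutMs={int(timeout_ms)})"
```

For example, `with_timeout("select * from mytable limit 10", 60000)` yields the query shown in the second step, and `diagnose` can then be run on the result once the query completes.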