Dataproc is a managed Spark and Hadoop service that lets you use open-source data tools for batch processing, querying, streaming, and machine learning. For more details, refer to the GCP documentation.
Log and metric types
You can collect logs and metrics for Sumo Logic's Google Cloud Dataproc integration by following the steps below.
Configure logs collection
Collect Audit Logs using the Google Cloud Platform source. Access to these Audit Logs depends on your permissions and roles. To enable logging for Google Dataproc, refer to Google documentation. For more detail on which Dataproc operations are audited, refer to audited operations. While creating the sink in GCP, as part of the Choose logs to include in sink section, you can use the following query:
Collect Platform Logs using the Google Cloud Platform source. Dataproc platform logs include job logs and cluster logs. See the permissions required to access job and cluster logs. While creating the sink in GCP, as part of the Choose logs to include in sink section, you can use the following query:
(resource.type=(cloud_dataproc_cluster OR cloud_dataproc_job))
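If you prefer to create the sink from a script rather than the console, the platform logs query above can be passed as the sink's inclusion filter. The following is a minimal sketch, not an official procedure: it assumes the google-cloud-logging Python client library is installed, that the sink routes to a Pub/Sub topic you have already created for the Google Cloud Platform source, and that the project ID, topic ID, and sink name are placeholders you supply. The helper names are hypothetical and chosen for illustration.

```python
from typing import Iterable

# Resource types taken from the platform logs query in this section.
DATAPROC_RESOURCE_TYPES = ("cloud_dataproc_cluster", "cloud_dataproc_job")


def build_platform_log_filter(
    resource_types: Iterable[str] = DATAPROC_RESOURCE_TYPES,
) -> str:
    """Return an inclusion filter matching any of the given resource types."""
    joined = " OR ".join(resource_types)
    return f"(resource.type=({joined}))"


def create_dataproc_sink(
    project_id: str,
    topic_id: str,
    sink_name: str = "dataproc-platform-logs",  # hypothetical sink name
):
    """Create a log sink that routes Dataproc platform logs to a Pub/Sub topic.

    Assumes the topic already exists and that your credentials are allowed
    to create sinks in the project.
    """
    # Imported lazily so the filter helper works without the library installed.
    from google.cloud import logging as gcp_logging

    client = gcp_logging.Client(project=project_id)
    destination = f"pubsub.googleapis.com/projects/{project_id}/topics/{topic_id}"
    sink = client.sink(
        sink_name,
        filter_=build_platform_log_filter(),
        destination=destination,
    )
    sink.create()
    return sink
```

Calling `build_platform_log_filter()` with no arguments reproduces the query shown above, and passing a single resource type narrows the sink to only job logs or only cluster logs. After the sink is created, its writer identity typically still needs permission to publish to the destination topic before logs start flowing.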