AWS Glue provides serverless data integration spanning schema definition and ETL.
The Glue Data Catalog is a metadata store compatible with the Apache Hive metastore.
- Databases define logical groups of table definitions.
- Tables hold the metadata that describes a dataset, including its schema.
- Crawlers try classifiers in turn to determine the schema of the source data, then create the table metadata. Crawlers can work with data partitioned across multiple files, given a directory with a trailing