Composable Testcontainers fixtures for local big-data integration tests.
bigdata-test starts only the services a test asks for and exposes their
connection properties through a small Kotlin/JUnit API. Heavier setup such as
S3 JCEKS generation, bucket creation, and Kafka Avro seeding lives in the
optional extensions module so core and junit5 stay lightweight.

- core: container builder, service options, endpoints, and log routing
- junit5: @BigDataTest extension and parameter injection
- extensions: config-driven setup hooks for JCEKS, buckets, Kafka Avro, and Kerberos material
- bigdata-test-spring-boot-autoconfigure: Spring Boot auto-configuration
- bigdata-test-spring-boot-starter: Spring Boot starter
- example:junit: plain JUnit 5 examples
- example:spring: Spring Boot example
- example:spark: Spark, HMS, Kafka, S3, GCS, Iceberg, and Kerberos smoke tests

Use @BigDataTest in a JUnit 5 test and request BigDataTestKit as a parameter:
@BigDataTest(
    hiveMetastore = true,
    kafka = true,
    schemaRegistry = true,
    localStackS3 = true,
)
class MyIntegrationTest {
    @Test
    fun test(kit: BigDataTestKit) {
        val metastoreUri = kit.endpoint(BigDataService.HIVE_METASTORE)
            .property("hive.metastore.uris")
        val bootstrapServers = kit.endpoint(BigDataService.KAFKA)
            .property("bootstrap.servers")
    }
}

For declarative test setup, add @BigDataExtensions with TOML config:
@BigDataExtensions("classpath:bigdata-extensions.toml")
@BigDataTest(hdfs = true, kafka = true, schemaRegistry = true, localStackS3 = true)
class MyIntegrationTest

[s3Jceks]
enabled = true
hdfsDir = "/bigdata-test/demo"
fileName = "s3.jceks"
[kafkaAvro]
enabled = true
[[kafkaAvro.topics]]
name = "events"
schema = "classpath:schemas/event.avsc"
records = [
    { key = "alpha", value = { id = 1, name = "alpha" } },
]

HTTP services can be exposed through an HAProxy TLS gateway from TOML:
[services]
localStackS3 = true
[localStackS3Tls]
enabled = true
domain = "localhost"

The endpoint properties then return HTTPS URLs and JVM truststore settings such as javax.net.ssl.trustStore.
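
A minimal sketch of how a test might apply those truststore settings, assuming the endpoint properties have already been copied into a plain `Map<String, String>`. The helper name `applyTrustStoreProperties`, the `fs.s3a.endpoint` entry, and all sample values are hypothetical and not part of bigdata-test; only the `javax.net.ssl.trustStore` key comes from the text above.

```kotlin
// Hypothetical helper (not part of bigdata-test): copies every javax.net.ssl.*
// entry from a map of endpoint properties into JVM system properties and
// returns the keys that were applied.
fun applyTrustStoreProperties(endpointProperties: Map<String, String>): Set<String> {
    val sslProperties = endpointProperties.filterKeys { it.startsWith("javax.net.ssl.") }
    sslProperties.forEach { (key, value) -> System.setProperty(key, value) }
    return sslProperties.keys
}

fun main() {
    // Placeholder values; in a real test they would be read from the kit's endpoint.
    val endpointProperties = mapOf(
        "fs.s3a.endpoint" to "https://localhost:8443", // assumed key and port
        "javax.net.ssl.trustStore" to "/tmp/bigdata-test/truststore.jks",
        "javax.net.ssl.trustStorePassword" to "changeit",
    )

    val applied = applyTrustStoreProperties(endpointProperties)
    println("Applied JVM properties: $applied")
}
```

The JVM's default SSL context typically reads these system properties once, when it is first initialized, so a helper like this would need to run before the test opens its first HTTPS connection.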
Use the shared Gradle home when running this repository locally:
GRADLE_USER_HOME=/data/.gradle ./gradlew check

Run the Spark smoke test:
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:test

Run the Spark dependency/HMS/Kerberos matrix:
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:sparkBigDataMatrixTest

Individual matrix cells are also available as separate tasks. Each task name encodes three axes in order: the Spark/Hadoop dependency line, the HMS implementation, and whether Kerberos is enabled:
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:sparkApacheDepsApacheHmsTest
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:sparkApacheDepsApacheHmsKerberosTest
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:sparkApacheDepsClouderaHmsTest
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:sparkApacheDepsClouderaHmsKerberosTest
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:sparkClouderaDepsApacheHmsTest
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:sparkClouderaDepsApacheHmsKerberosTest
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:sparkClouderaDepsClouderaHmsTest
GRADLE_USER_HOME=/data/.gradle ./gradlew :example:spark:sparkClouderaDepsClouderaHmsKerberosTest

- Detailed usage: doc/user-guide.adoc
- Contributing: CONTRIBUTING.md
- Current Hive Docker HMS notes: doc/hive-docker-hms-issues.adoc