Unit testing SQL with PySpark
Machine-learning applications frequently feature SQL queries, which range from simple projections to complex aggregations over several join operations. There doesn’t seem to be much guidance on how to verify that these queries are correct. All mainstream programming languages have embraced unit tests as the primary tool to verify the correctness of the language’s smallest building […]