About two weeks ago I got into a debate with a teammate about unit testing Hadoop Pig scripts. My colleague's view was that only the UDFs utilized by the scripts should be unit tested. No one should use PigUnit to unit test a script. He thought the whole idea was silly. He drew an equivalence to attempting to unit test a SQL call. Would anyone ever be silly enough to actually test a select statement? My answer is "I do!"
The debate was over a mix of semantics and philosophy. He was getting caught up in my use of the word "unit" when I said I wanted to unit test. To him, a Pig script, or apparently a SQL statement, is a non-unit-testable component because it requires integration with some supporting system. One cannot easily mock Pig (you can mock a database, though there is some debate about whether that's necessary). In his mind, the direct dependency on Pig made the test an integration test and, as such, it should not require regression tests or even correctness tests.
I wanted a repeatable set of tests that proved that each major component of the script was doing its job. Our scripts are not the simple examples tossed about in various blogs. They often require 10+ macro definitions, each with 5+ lines of Pig, many of which rely on UDFs. Not testing such code is negligence. Our entire system depends on these scripts ripping through gigs of data, looking for complex patterns to find fraud. We have to know, before deploying the code to even the developer's cluster, that the scripts work as expected over a variety of use cases.
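For what it's worth, PigUnit makes this kind of test reasonably painless. Here's a rough sketch of the shape these tests take; the script path, aliases, and data below are made up for illustration, but the pattern of feeding inline input into one alias and asserting on another is the one PigUnit's PigTest supports, and it runs without a cluster.

```java
import org.apache.pig.pigunit.PigTest;
import org.junit.Test;

public class FraudPatternScriptTest {

  @Test
  public void countsEventsPerAccount() throws Exception {
    // Hypothetical script path; 'events' and 'counts' are aliases assumed
    // to exist in that script.
    PigTest test = new PigTest("src/main/pig/fraud_patterns.pig");

    // Inline input replaces whatever LOAD backs the 'events' alias,
    // so no HDFS or cluster is needed.
    String[] input = {
        "acct1\t2013-01-01\t100.00",
        "acct1\t2013-01-02\t250.00",
        "acct2\t2013-01-01\t75.00",
    };

    String[] expected = {
        "(acct1,2)",
        "(acct2,1)",
    };

    // Feed 'events' the inline data and assert on the 'counts' alias.
    test.assertOutput("events", input, "counts", expected);
  }
}
```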
As a result of this discussion, I've come to the conclusion that no one should "unit test". They should instead just test. Qualifying the test type just opens the conversation up to religious debate. The goal should be test coverage of at least 80% of the code base, achieved in a way that isolates the code you really care about when you test.
Looking at the testing problem this way might be a substantial change to how one develops regularly. For example, I don't just black-box test my code. I have a set of tests that check conformance to the contracts, but I also test the underlying code to make sure that its implementation-specific features are doing what they should. If I'm testing a DAO, I make the test transactional so I can insert or delete data to check that my DAO's data mappings are working. Is this a unit test? No, probably not. Should it be part of the normal build? Absolutely! If you don't run such tests as mainline tests, you could be shipping a product that will fail once deployed.
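To make that concrete, here's a rough sketch of what I mean using Spring's test support. AccountDao, Account, and the context file name are hypothetical stand-ins; the point is the @Transactional rollback, which lets the test touch a real database and still leave it clean.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import org.springframework.transaction.annotation.Transactional;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration("classpath:test-context.xml") // hypothetical test context
@Transactional // Spring rolls the transaction back after each test method
public class AccountDaoTest {

  @Autowired
  private AccountDao accountDao; // hypothetical DAO under test

  @Test
  public void insertAndReadBackAccount() {
    // Insert through the DAO, then read it back to prove the mappings work.
    Account account = new Account("acct-123", "Jane Doe");
    accountDao.save(account);

    Account found = accountDao.findByName("Jane Doe");
    assertEquals("acct-123", found.getId());
  }
}
```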
Approaching the problem from this perspective will improve your code quality. It doesn't add much time to the development cycle. It might even save time if you are in an IBM shop where WAS takes 2-3 minutes to boot, let alone the added time to navigate the system and get to your change. This approach works well with both TDD and BDD concepts.
There are two points at which you should start wondering whether you are testing properly. The first is when your builds start taking more than a few minutes to complete because of the testing phase. That might mean you have too many small tests that could be consolidated. It might mean you've got a test acting as an integration test when it should be refactored to use mocks and isolate the components. The second is when you have a component that is highly coupled to external modules. If you cannot use DI to slip mocks in for other parts, you probably aren't testing well and you probably aren't designing well. Inversion of control will help break your problem down into smaller, testable, bite-size parts.
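As a sketch of what I mean by slipping in mocks (the class names here are invented for illustration), constructor injection keeps the external dependency behind an interface that a test can replace with a Mockito mock:

```java
import static org.junit.Assert.assertTrue;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class FraudScorerTest {

  // Hypothetical interface hiding an external module (a database, a service, a Pig job).
  interface TransactionHistory {
    int countRecentTransfers(String accountId);
  }

  static class FraudScorer {
    private final TransactionHistory history;

    FraudScorer(TransactionHistory history) { // the dependency is injected, not constructed
      this.history = history;
    }

    boolean isSuspicious(String accountId) {
      return history.countRecentTransfers(accountId) > 10;
    }
  }

  @Test
  public void flagsAccountsWithManyRecentTransfers() {
    // Because the dependency comes in through the constructor, a mock can stand in
    // for the real external module, keeping the test small and fast.
    TransactionHistory history = mock(TransactionHistory.class);
    when(history.countRecentTransfers("acct-42")).thenReturn(25);

    assertTrue(new FraudScorer(history).isSuspicious("acct-42"));
  }
}
```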
Avoiding labels is not always possible. But even when you can't, please remember that the label is just a concept to help you frame the problem. When you start fighting over the label's font size, so to speak, and not solving the problem, you've got to find your way back to the path and carry on.