The digital universe is doubling in size every two years and is expected to reach 44 trillion gigabytes by 2020. Up to 90 percent of that data is unstructured or semi-structured, which presents a twofold challenge: finding a way to store all of this data and maintaining the capacity to process it quickly. This is where a data lake comes in.