Modern Analytics: Data Lakes, Data Warehouses, and Clouds

Big Data Quarterly | Spring 2021 | Sponsored Content

EVERY BUSINESS IS A DATA BUSINESS. Yet most organizations still struggle to capture and transform data from diverse and disparate sources to stay competitive. Today's data teams struggle with what we call the "Three Vs" of modern data:

• Volume: This year, Matillion customers loaded 5.4 trillion rows of data per month into cloud data warehouses. And that number continues to go up.
• Variety: Enterprise organizations used an average of 1,080 data sources in enterprise analytics (IDG, 2019).
• Velocity: Business is moving faster than ever. Organizations need to act on information as close to real time as possible, which means that information has to be available and ready for analytics.

DATA TEAMS MUST IMPROVE EFFICIENCY TO KEEP UP

For data teams to continue bringing data in from various sources and making it analytics-ready at the pace that the business demands (i.e., as soon as possible), the most effective strategy is to find efficiencies that speed up work: minimize the time it takes to do "custodial" data tasks like coding pipelines, do more work in parallel, and productionize workflows so that multiple team members can step in at any time.

The right tools can help you improve data team efficiency and analytics productivity to not only reduce the workload of your teams but also help people across the organization get to insights faster. Here are four ways modern data teams can begin to move at modern speeds.

1. Reuse and borrow to create repeatable processes

If you run a data analysis that yields useful knowledge for end users, they will want you to run that analysis again. And again. If you can build repeatable pipelines and processes into saved jobs, that will be immensely helpful to your future self. In coding, one of the first things developers do is look at libraries that they can reuse. Data tools should offer that same functionality.
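
To make the idea of a saved, repeatable job concrete, here is a minimal Python sketch. It is not Matillion's API or any customer's workflow; the names `drop_missing`, `rename_field`, and `make_job`, along with the sample rows, are hypothetical and chosen only for illustration. It shows small reusable steps composed once into a job that can then be rerun on new data.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def drop_missing(field: str) -> Step:
    """Reusable step (hypothetical): drop rows where `field` is absent or None."""
    def step(rows: List[Row]) -> List[Row]:
        return [r for r in rows if r.get(field) is not None]
    return step


def rename_field(old: str, new: str) -> Step:
    """Reusable step (hypothetical): rename one field in every row."""
    def step(rows: List[Row]) -> List[Row]:
        return [{(new if k == old else k): v for k, v in r.items()} for r in rows]
    return step


def make_job(*steps: Step) -> Step:
    """Compose steps into one saved job that can be rerun on any input."""
    def job(rows: List[Row]) -> List[Row]:
        for s in steps:
            rows = s(rows)
        return rows
    return job


# Define the job once...
clean_orders = make_job(
    drop_missing("amount"),
    rename_field("amount", "amount_usd"),
)

# ...then reuse it whenever the same analysis is requested again.
weekly = [{"customer": "a", "amount": 10}, {"customer": "b", "amount": None}]
monthly = [{"customer": "c", "amount": 7}, {"customer": "d"}]

weekly_clean = clean_orders(weekly)
monthly_clean = clean_orders(monthly)
```

The design choice is that each step knows nothing about the others, so a new report can usually be assembled from existing steps instead of a new one-off script.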
SLACK SAVES TIME WITH REPEATABLE PROCESSES

Slack, a Matillion customer, used repeatable patterns to reduce the number of discrete workflows they had from 10 down to just one. By streamlining and productionizing efforts, they were able to reduce the time needed to generate new reports from six hours down to just 30 minutes. Those aren't incremental improvements; they're game-changers for the company.

2. Unleash the Cloud

If you're not working with data in the cloud, you're missing out on a major opportunity to modernize how you work with data and move faster. The cloud is faster, more scalable, and more affordable than traditional data architectures. Add in cloud data warehouses and cloud-native data integration tools that are built to take full advantage of the speed and scale of the cloud, and data team productivity can skyrocket.

REDUCE TIME TO INSIGHT SIGNIFICANTLY IN THE CLOUD

Several Matillion customers have seen huge speed gains in the cloud. DocuSign reduced its ETL runtime by 72 percent in the cloud. The San Francisco Giants, working with Matillion partner Data Clymer, reduced time to new insights by 50 percent.

3. Leverage the Lakehouse architecture

Different teams use data differently. Data scientists are likely to pull the data they need from a data lake, while data analysts and engineers work within a data warehouse. They are working in two different environments but duplicating data and processes, which creates extra work.

This more traditional architecture is split: one group goes off to a data lake to do data science while the other works within a data warehouse environment. Why aren't their needs met using a central data team and data location?

ENTER THE LAKEHOUSE

The more modern approach is to use the Lakehouse. With a Lakehouse, you load the data once, then transform and clean it once. Then you make sure all data teams have access to that nice, clean data, whether they're doing modeling, reporting, or any other activity. By consolidating data in a Lakehouse, you're consolidating work and making it possible to speed up analytics for faster time to insight.

4. Choose tools that foster collaboration

Any tool or platform that enables collaboration is essential for working efficiently within data teams. Collaboration can mean different things. It could mean working independently on parts of projects that will be combined later (for example, using Git). Or it could mean collaborating in real time within a shared workspace. Ideally, you want a tool that supports both types of collaboration.

Ready to move faster? See how Matillion can help your data team improve efficiency. Request a demo at Matillion

A Modern, Cloud-Native Approach to Accelerating Data Insights
