Solution briefs

METL Getting Started Best Practices


Installation
• Immediately after launch, make sure you can SSH on to your Matillion instance
• To update Matillion ETL, create a new Matillion instance and migrate your work from the old instance to the new one. Further details available in How to Update Matillion – Best Practices
• Take regular, automated backups of your metadata via Automated EBS snapshots from the Admin menu, an export through either the GUI or our API, using our Enterprise Git integration feature, or all of the above
• If using a High Availability cluster, make sure your entire workload can be serviced by just one of the two nodes on its own
• If you are relying on any OS customization (e.g. you have added libraries), ensure that you can automate the customizations
• Have a procedure to follow in the event of operational problems (e.g. if you can't access your Matillion server, know who to contact for server administration and networking support)

Design
• Do ELT, not ETL. Load data as-is into the cloud data warehouse (CDW), then transform it within the CDW afterwards
• Don't use Python scripts for data transformations (that's usually a sign that you are doing ETL, not ELT)
• Avoid using iterators over data: only iterate over metadata
• Avoid iterators when the CDW can do large operations all at once, such as loading multiple files simultaneously
• Follow the best practices of your target cloud data warehouse when building transformations. See our product-specific ebooks below:
  ■ Matillion ETL for Amazon Redshift
  ■ Matillion ETL for Google BigQuery
  ■ Matillion ETL for Snowflake
• Avoid manual transactions if you can create idempotent logic. If you must use manual transactions, make them BIG
• Don't hand-write DML statements: use a Transformation Job
• Have a data model (i.e. don't create new tables with no organization or standards)
• Plan for a dedicated staging layer in your data warehouse (such as a schema of objects that are always safe to drop or replace)
• Use Transformation Jobs to insert/merge data from the stage into analytics/reporting layers such as ODS, Star, or Cube layers (can be in separate schemas as well)
• Take advantage of Shared Jobs in order to create reusable components
• Within a Transformation Job, only have one write component per target table
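
As a rough illustration of several Design points above (load data as-is into a staging layer that is always safe to replace, transform inside the warehouse with one set-based statement, and prefer idempotent logic over manual transactions), here is a minimal Python sketch. It is not Matillion functionality and not how a Transformation Job is built: sqlite3 merely stands in for the CDW, and the table names stg_orders and ods_orders and both function names are hypothetical. In Matillion itself this logic would live in job components, not in a script.

```python
import sqlite3

# Assumption: sqlite3 is only a stand-in for a cloud data warehouse.
# The tables stg_orders / ods_orders and both functions are hypothetical.


def load_stage(conn, rows):
    """Load raw rows as-is into a staging table that is safe to drop."""
    conn.execute("DROP TABLE IF EXISTS stg_orders")
    conn.execute("CREATE TABLE stg_orders (order_id INTEGER, amount TEXT)")
    conn.executemany("INSERT INTO stg_orders VALUES (?, ?)", rows)


def merge_into_target(conn):
    """Transform inside the database with a single set-based upsert.

    There is no per-row loop in Python; the database does the work.
    INSERT OR REPLACE is SQLite syntax; a real CDW would typically
    use its own MERGE or equivalent statement.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ods_orders ("
        "order_id INTEGER PRIMARY KEY, amount_cents INTEGER)"
    )
    conn.execute(
        """
        INSERT OR REPLACE INTO ods_orders (order_id, amount_cents)
        SELECT order_id,
               CAST(ROUND(CAST(amount AS REAL) * 100) AS INTEGER)
        FROM stg_orders
        """
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    raw = [(1, "19.99"), (2, "5.00")]
    for _ in range(2):  # a rerun should leave the target unchanged
        load_stage(conn, raw)
        merge_into_target(conn)
    print(conn.execute(
        "SELECT order_id, amount_cents FROM ods_orders ORDER BY order_id"
    ).fetchall())
```

Because the stage is rebuilt on every run and the target is keyed on order_id, rerunning the load after a failure does not duplicate rows, which is the property that makes a manual transaction unnecessary in this sketch.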
