Simetrique #2: Present and Future
The one about Analytics Engineering, GPT-4, and the ETL market
Welcome to the new issue of Simetrique!
In this edition, we’ll be covering three topics that are shaping the present and future of the data world.
First, the recently published report “The State of Analytics Engineering”, which gives a picture of where analytics engineering stands today.
Second, the announcement of GPT-4, the latest addition to the GPT series of language models, and what it means for us data engineers.
Finally, an interesting prediction about the future of the ETL market, which could have a significant impact on the data industry in the years to come.
The state of Analytics Engineering
Folks from dbt Labs released a report called “The State of Analytics Engineering”. It’s their first report of this kind: they surveyed 567 data practitioners and shared the results publicly. I picked out a couple of interesting points that I’m going to share here.
The first interesting insight is about the distribution of responsibilities within the data team. From the answers it’s clear that data engineers are usually involved as cross-functional specialists (i.e. they are not tied to a specific business area), while analytics engineers tend to work closely with particular business units. That rings true: at my job we have the same distribution.
Next, teams dedicate most of their time to maintaining data sets. Maintaining a platform comes second. For me, this is a positive trend, because valuable time is spent on business needs rather than infrastructure. (Although the result might be biased, because most of the respondents work closely with analytics.)
Compensation. Most engineers in North America earn more than $100K per year, as do data analysts and data scientists. In Europe, however, they fall into the $50K-$100K bracket, while managers can still earn more than $100K.
Lastly, I wanted to point out two interesting questions:
Do you agree with this statement: “My organization sets clear goals for the data team. We have a roadmap on how to execute.”
About 52% responded positively, 36% negatively.
Do you agree with this statement: “My organization values the data team. We are respected and included in decisions that impact our work.”
About 68% responded positively, 22% negatively.
It means that the majority of companies already understand the value of data, but only about half of them set clear goals and plan their work.
AI week of the year (so far)
Last week was full of AI announcements and news. Just check out the list of products in this post. Here, however, I’d like to stick to the GPT-4 release.
GPT-4 is a bigger, better, and more expensive model than the previous versions, and it delivers much better results. For example, it can pass many exams and hit the 90th percentile of scores. Or it can summarize a given text while making every word start with “Q”.
But what does it give us, data specialists?
First of all, the new model has a much bigger input context (i.e. how much text you can give it as input for a task). I imagine that you could give it the Apache Spark docs and ask any Spark-related question. (However, it might be an expensive pleasure.)
Second, I believe that sooner or later it will be integrated with GitHub Copilot, the code-writing assistant. Copilot is already a very good assistant, but a new “backend” will likely make it even more powerful, with better context-aware suggestions or docs generation based on the codebase.
Lastly, people will build crazy smart bots that write SQL from data requests made by business people. Yeah, the future is here.
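To make the SQL bot idea a bit more concrete, here is a rough Python sketch of how such a bot could be wired up. Everything in it is hypothetical: the function names, the prompt wording, the example schema, and `fake_model`, which only stands in for a real language model call and returns a canned query. It is not how any particular product works.

```python
from typing import Callable


def build_sql_prompt(schema: str, request: str) -> str:
    """Combine a table schema and a plain-language request into one prompt."""
    return (
        "You write SQL for analysts.\n"
        f"Schema:\n{schema.strip()}\n\n"
        f"Request: {request.strip()}\n"
        "Answer with a single SQL query and nothing else."
    )


def request_to_sql(schema: str, request: str, complete: Callable[[str], str]) -> str:
    """Send the prompt to any text-completion function and return its answer."""
    prompt = build_sql_prompt(schema, request)
    return complete(prompt).strip()


def fake_model(prompt: str) -> str:
    # Assumption: a stand-in for a real model call, so the sketch runs offline.
    return "SELECT country, COUNT(*) AS orders FROM orders GROUP BY country;\n"


if __name__ == "__main__":
    schema = "orders(id INT, country TEXT, amount NUMERIC)"
    print(request_to_sql(schema, "orders per country", fake_model))
```

Passing the model in as a plain function keeps the sketch independent of any vendor: swapping `fake_model` for a real client call would not change the rest of the code. In practice you would also want to validate the generated query before running it against real data.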
ETL market and Fivetran
My last bit is about the future. Read Jeffrey Richman’s post called “Fivetran is the new Xerox of data”.
Basically, the whole Fivetran ecosystem might collapse one day if giants like Snowflake or BigQuery include ETL of third-party data as a standard feature. The Internet killed paper and Xerox. One big brand may kill another just by releasing a piece of software.
Indeed, what a fascinating time to be alive.
That’s all for today.
Have a great week and see you next time 👋




