What Carousell learned about scaling BI in the cloud


As companies like Carousell push more reporting into cloud data platforms, a bottleneck is emerging inside business intelligence stacks. Dashboards that once worked fine at small scale begin to slow down, queries stretch into tens of seconds, and minor schema errors ripple through reports. In short, teams find themselves balancing two competing needs: stable executive metrics and flexible exploration for analysts.

The tension is becoming common in cloud analytics environments, where business intelligence (BI) tools are expected to serve both operational reporting and deep experimentation. The result is often a single environment doing too much – acting as a presentation layer, a modelling engine, and an ad-hoc compute system at once.

A recent architecture change inside Southeast Asian marketplace Carousell shows how some analytics teams are responding. Details shared by the company’s analytics engineers describe a move away from a single overloaded BI instance toward a split design that separates performance-critical reporting from exploratory workloads. While the case reflects one organisation’s experience, the underlying problem mirrors broader patterns seen in cloud data stacks.

When BI becomes a compute bottleneck

Modern BI tools allow teams to define logic directly in the reporting layer. That flexibility can speed up early development, but it also shifts compute pressure away from optimised databases and into the visualisation tier.

At Carousell, engineers found that analytical “Explores” were frequently connected to extremely large datasets. According to Analytics Lead Shishir Nehete, datasets sometimes reached “hundreds of terabytes in size,” with joins executed dynamically inside the BI layer, not upstream in the warehouse. The design worked – until scale exposed its limits.

Nehete explains that heavy derived joins led to slow execution paths. “Explores” pulling large transaction datasets were assembled on demand, which increased compute load and pushed query latency higher. The team discovered that 98th percentile query times averaged roughly 40 seconds, long enough to disrupt business reviews and stakeholder meetings. The figures come from Carousell’s internal performance tracking, shared by the analytics team.
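
Tail-latency metrics like the one Carousell tracked are usually computed from query logs. A minimal sketch of that calculation, assuming a list of query latencies in seconds (the values below are illustrative, not Carousell's data):

```python
import math

def percentile(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method."""
    ordered = sorted(values)
    # Nearest-rank: the smallest value such that pct% of samples fall at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical per-query latencies pulled from BI query logs, in seconds.
latencies = [2.1, 2.8, 3.3, 3.5, 4.2, 5.0, 38.9, 39.7, 41.0, 44.3]
p98 = percentile(latencies, 98)
print(f"p98 query latency: {p98:.1f}s")
```

Tracking a high percentile rather than the mean is what surfaces the problem here: a handful of on-demand joins over huge tables can leave the average looking healthy while the slowest queries are the ones stakeholders actually sit through.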

Performance was only part of the challenge. Governance gaps created additional risk: developers could push changes directly into production models without rigorous tests, which sped up feature delivery but introduced fragile dependencies. A tiny error in a field definition could cause downstream dashboards to fail, forcing engineers into reactive fixes.

Separating stability from experimentation

Rather than continue fine-tuning the existing environment, Carousell engineers chose to rethink where compute work should live. Heavy transformations moved upstream into BigQuery pipelines, where the warehouse engine is built to handle large joins. The BI layer shifted toward metric definition and presentation.

The larger change came from splitting responsibilities across two BI instances. One environment was dedicated to pre-aggregated executive dashboards and weekly reporting. The datasets were prepared in advance, allowing leadership queries to run against optimised tables instead of raw transaction volumes.
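
The pre-aggregation pattern itself is straightforward: raw transaction rows are rolled up once upstream, so dashboard queries scan a small summary table instead of joining raw volumes on demand. A simplified sketch with illustrative field names (not Carousell's actual schema):

```python
from collections import defaultdict

# Hypothetical raw transaction rows, as they might land in the warehouse.
raw_transactions = [
    {"date": "2024-06-01", "category": "electronics", "amount": 120.0},
    {"date": "2024-06-01", "category": "electronics", "amount": 80.0},
    {"date": "2024-06-01", "category": "fashion", "amount": 45.0},
    {"date": "2024-06-02", "category": "fashion", "amount": 60.0},
]

def pre_aggregate(rows):
    """Roll raw rows up to one summary row per (date, category)."""
    totals = defaultdict(lambda: {"revenue": 0.0, "orders": 0})
    for row in rows:
        key = (row["date"], row["category"])
        totals[key]["revenue"] += row["amount"]
        totals[key]["orders"] += 1
    return {key: dict(val) for key, val in totals.items()}

summary = pre_aggregate(raw_transactions)
# Executive dashboards now read `summary` (3 rows) rather than every transaction.
```

In practice this rollup would run as a scheduled warehouse pipeline rather than in Python, but the trade-off is the same: aggregation cost is paid once at load time instead of on every dashboard refresh.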

The second environment remains open for exploratory analysis. Analysts can still join granular datasets and test new logic without risking performance degradation in their executive colleagues’ workflows.

The dual structure reflects a broader cloud analytics principle: isolate high-risk or experimental workloads from production reporting. Many data engineering teams now apply similar patterns in warehouse staging layers or sandbox projects. Extending that separation into the BI tier helps maintain predictable performance under growth.

Governance as part of infrastructure

Stability also depended on stronger release controls. BI Engineer Wei Jie Ng describes how the new environment introduced automated checks through Looker CI and Look At Me Sideways (LAMS), tools that validate modelling rules before code reaches production. “The system now automatically catches SQL syntax errors,” Ng says, adding that failed checks block merges until issues are corrected.

Beyond syntax validation, governance rules enforce documentation and schema discipline. Each dimension requires metadata, and connections must point to approved databases. The controls reduce human error while creating clearer data definitions, an important foundation as analytics tools begin to add conversational interfaces.
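
A pre-merge governance check of this kind can be sketched in a few lines. The rules and names below are illustrative, in the spirit of the checks described above; they are not LAMS's actual API:

```python
# Hypothetical allow-list of approved warehouse connections.
APPROVED_CONNECTIONS = {"bigquery_prod"}

def validate_model(model):
    """Return a list of governance violations; an empty list means the merge may proceed."""
    errors = []
    if model["connection"] not in APPROVED_CONNECTIONS:
        errors.append(f"unapproved connection: {model['connection']}")
    for dim in model["dimensions"]:
        # Every dimension must carry metadata (a description) before it ships.
        if not dim.get("description"):
            errors.append(f"dimension '{dim['name']}' is missing a description")
    return errors

model = {
    "connection": "bigquery_prod",
    "dimensions": [
        {"name": "order_id", "description": "Unique order identifier"},
        {"name": "gmv"},  # missing description: this should block the merge
    ],
}
violations = validate_model(model)
```

Running such a check in CI, with a non-empty violation list failing the build, is what turns documentation rules from convention into enforcement.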

According to Carousell engineers, structured metadata prepares datasets for natural-language queries. When conversational analytics tools read well-defined models, they can map user intent to consistent metrics instead of guessing relationships.

Performance gains – and fewer firefights

After the redesign, the analytics team reported measurable improvements. Internal tracking shows those 98th percentile query times falling from over 40 seconds to under 10 seconds. The change altered how business reviews unfold: instead of asking whether dashboards were broken, stakeholders could focus on evaluating the data live. Just as importantly, engineers could shift away from constant troubleshooting.

While every analytics environment has unique constraints, the broader lesson is straightforward: BI layers should not double as heavy compute engines. As cloud data volumes grow, separating presentation, transformation, and experimentation reduces fragility and keeps reporting predictable.

For teams scaling their analytics stacks, the question isn’t tooling choice but architectural boundaries – deciding which workloads belong in the warehouse and which live in BI.

See also: Alphabet boosts cloud investment to meet rising AI demand

(Photo by Shutter Speed)

