-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Summary table [WIP] #8427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Summary table [WIP] #8427
Conversation
# Conflicts: # package.json
…ntly visible region only. This allows the cell data to be displayed "properly" and to be scrolled the same way other cells scroll.
I've tested out this PR and talked with @metabase/core-developers. Before reviewing Clojure style and other more minor changes to this PR, I think it would be good to talk about the approach at a higher level. One of the problems we have had recently is XRays causing a heavy load on our backend. The issue wasn't XRays specifically, but rather XRays would create a dashboard with a lot of cards. Hitting that dashboard would cause an HTTP request for each card in the dashboard simultaneously. Each card executes a query, consumes a database connection and an HTTP request thread. This would cause a heavy load on the backend, sometimes causing Metabase instances to crash (#8387 #8312 #8295). The current state of this PR will cause a similar problem I think. Currently columns are broken into two categories, columns that can be summarized and columns that can be grouped. Currently the grouped columns cause a query to be generated and simultaneously executed for each column. I tried summarizing some queries using our sample dataset and found 6-8 queries being issued. Each time an adjustment to the query results was made (i.e. change column width) it would cause those 6-8 queries to execute again. I'm not exactly sure the best way get this functionality without the cost. One option would be to allow the backend to define the "superquery" and do so with unioned queries. This would allow a single HTTP call with the backend generating a large Another option would be to separate the more complicated summary calculations (i.e. average) and just support simple calculations (sum, count etc) and do those calculations in memory. This would involve pulling the full resultset and doing the collapsing/groupings with the aggregates on the front end. We could then come back to the more complicated calculations later or find another/different way of supporting that functionality. @metabase/core-developers chime in here if I've missed anything or if you have something to add. |
All the aggregates we support aside from distinct count have a single pass formulation where you aggregate simpler statististics and derive the one you actually want from those. Eg. to calculate average you track the sum and count, and divide them in the last step. It turns out that all can also be merged in straightforwad way, which is exctly what we need for summaries (continuing the avg example, you sum the sums and sum the counts of constituent groups and have components of the avg for the total). This approach would necessitate only a single relatively straightforward and performant query and no changes on the BE, at the cost of some additional fiddling on the FE. The other downside is, that we wouldn't be able to support distinct counts; and it potentially puts a constraint on what kind of aggregates we add in the future. To get a better feel for the idea, have a look how these summary statstics are implemented in kixi https://github.com/MastodonC/kixi.stats/blob/v0.4.3/src/kixi/stats/core.cljc If this is something we want to explore further I'm happy to prepare what we'd need in terms of query aggregates and how to combine them. |
@r-kot First of all, this is an impressive PR, thanks for working on it. @mazameli I was surprised it worked for SQL queries, but I noticed it's not actually doing the right thing for the summary of any aggregations besides count/sum such as average... it's doing a sum of the averages rather than the average of all the values in the grouping. This is true for both SQL and GUI queries (it looks like super queries are just doing nested queries rather than doing the same query with different breakouts, which is why it sort of works for native queries) |
I like the idea of calculations in clojure, but i'm not sure what should be returned by endpoints (eg /api/card/:id/query). I prefer to get a list of DatasetData, somenthing similar to: I implemented (in js) simple calculator for totals. It has bugs, but after some work we can use it instead of additional queries. |
Hi @metabase/core-developers, |
There are no labels here for when is planned to release this feature? =) |
@dannyeuu correctly observed 🙂 Given concerns raised in the above discussion starting in #8427 (comment) I think that’s only appropriate. Even though the PR includes a lot of nice stuff (that I would also like to put my fingers on) if proper solutions/answers are not found to those concerns I’d rather wait. Putting a milestone estimate on a PR like this as this stage is probably like answering the question “How long time does it take to catch a fish?”. I’d rather have the core developers judge by their gut feeling + facts/testing WHEN this is ready than risk introducing (too much) quality debt. |
Do we have an update on this? |
Hi @rijnhardtkotze, |
Hey @r-kot We deployed your active branch and we are finding a few hiccups. When deployed in an iFrame, it throws the following error: Here is the complete stack trace:
|
I want to do the totals for the fields which are highlighted. |
@r-kot any things we could help about this PR? It seems really awesome and apart from performance concerns it feels like a finished part of Metabase. If you will describe goals you're trying to achieve, probably community here can help you! |
It's stopped this PR @mazameli ? It will be really nice to have this! |
Leaving another bump here. Any update? |
Just following up again. Any updates here? Super keen on getting this! |
Just following up again. Any updates here @tlrobinson ? |
It is unfortunate that this feature is no longer being developed looks like a must for Metabase. a natural representation of a drill-down |
This is a great feature, but, after some more discussion, this solution seems like it could have detrimental effects. Especially for users with big data sets. That, and given how outdated this PR has become, I'm going to close it. If anyone else is willing to take a stab at it, feel free to take up the baton. |
A must-have feature, I hope that it might get considered in future releases |
This new visualization type automatically groups fields in the order they appear in the query results by default, but this grouping can be reordered. This visualization also displays totals and the grand total by default, but this can be turned toggled per grouping field in the options.
TODO:
improve description
from Summary table [WIP] #8427 (comment)
fix drill-through
fix flow
refactor code
add tests
add comments
If there are changes to the backend codebase, run the backend tests with
lein test && lein eastwood && lein bikeshed && lein docstring-checker && ./bin/reflection-linter
Run the frontend and integration tests with
yarn lint && yarn flow && yarn test
)Sign the Contributor License Agreement
(unless it's a tiny documentation change).