Skip to content

Summary table [WIP] #8427

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 184 commits into from
Closed

Summary table [WIP] #8427

wants to merge 184 commits into from

Conversation

r-kot
Copy link
Contributor

@r-kot r-kot commented Aug 30, 2018

This new visualization type automatically groups fields in the order they appear in the query results by default, but this grouping can be reordered. This visualization also displays totals and the grand total by default, but this can be turned toggled per grouping field in the options.

obraz

obraz

TODO:

  • improve description

  • from Summary table [WIP] #8427 (comment)

    • 1. change the default settings
    • 2. change the default sort order
    • 3. add support for nested pivot
    • 4. fix sorting from header
    • 5. fix strange state
    • 6. fix settings behaviour
    • 7. fix disappearing header for a pivot table
    • 8. fix header for pivot
    • 9. fix the size of labels for totals
  • fix drill-through

  • fix flow

  • refactor code

  • add tests

  • add comments

  • If there are changes to the backend codebase, run the backend tests with lein test && lein eastwood && lein bikeshed && lein docstring-checker && ./bin/reflection-linter

  • Run the frontend and integration tests with yarn lint && yarn flow && yarn test)

  • Sign the Contributor License Agreement
    (unless it's a tiny documentation change).

Sorry, something went wrong.

r-kot and others added 30 commits April 23, 2018 15:05
…ntly visible region only. This allows the cell data to be displayed "properly" and to be scrolled the same way other cells scroll.
@senior
Copy link
Contributor

senior commented Oct 23, 2018

I've tested out this PR and talked with @metabase/core-developers. Before reviewing Clojure style and other more minor changes to this PR, I think it would be good to talk about the approach at a higher level.

One of the problems we have had recently is XRays causing a heavy load on our backend. The issue wasn't XRays specifically, but rather XRays would create a dashboard with a lot of cards. Hitting that dashboard would cause an HTTP request for each card in the dashboard simultaneously. Each card executes a query, consumes a database connection and an HTTP request thread. This would cause a heavy load on the backend, sometimes causing Metabase instances to crash (#8387 #8312 #8295).

The current state of this PR will cause a similar problem I think. Currently columns are broken into two categories, columns that can be summarized and columns that can be grouped. Currently the grouped columns cause a query to be generated and simultaneously executed for each column. I tried summarizing some queries using our sample dataset and found 6-8 queries being issued. Each time an adjustment to the query results was made (i.e. change column width) it would cause those 6-8 queries to execute again.

I'm not exactly sure the best way get this functionality without the cost. One option would be to allow the backend to define the "superquery" and do so with unioned queries. This would allow a single HTTP call with the backend generating a large UNION ALL query with what was issued as individual queries in this PR. This would need to be tested to figure out if it would work. I can imagine this would be a fairly complex query to run and would be pretty large as well. In addition to that I don't think we currently support UNION/UNION ALL in MBQL.

Another option would be to separate the more complicated summary calculations (i.e. average) and just support simple calculations (sum, count etc) and do those calculations in memory. This would involve pulling the full resultset and doing the collapsing/groupings with the aggregates on the front end. We could then come back to the more complicated calculations later or find another/different way of supporting that functionality.

@metabase/core-developers chime in here if I've missed anything or if you have something to add.

@sbelak
Copy link
Contributor

sbelak commented Oct 23, 2018

All the aggregates we support aside from distinct count have a single pass formulation where you aggregate simpler statististics and derive the one you actually want from those. Eg. to calculate average you track the sum and count, and divide them in the last step. It turns out that all can also be merged in straightforwad way, which is exctly what we need for summaries (continuing the avg example, you sum the sums and sum the counts of constituent groups and have components of the avg for the total).

This approach would necessitate only a single relatively straightforward and performant query and no changes on the BE, at the cost of some additional fiddling on the FE. The other downside is, that we wouldn't be able to support distinct counts; and it potentially puts a constraint on what kind of aggregates we add in the future.

To get a better feel for the idea, have a look how these summary statstics are implemented in kixi https://github.com/MastodonC/kixi.stats/blob/v0.4.3/src/kixi/stats/core.cljc

If this is something we want to explore further I'm happy to prepare what we'd need in terms of query aggregates and how to combine them.

@mazameli
Copy link
Contributor

Someone asked in our meeting yesterday if this currently works with SQL queries. I tried it out and yes, it does:

screen shot 2018-10-23 at 1 57 02 pm

@tlrobinson
Copy link
Contributor

@r-kot First of all, this is an impressive PR, thanks for working on it.

@mazameli I was surprised it worked for SQL queries, but I noticed it's not actually doing the right thing for the summary of any aggregations besides count/sum such as average... it's doing a sum of the averages rather than the average of all the values in the grouping. This is true for both SQL and GUI queries (it looks like super queries are just doing nested queries rather than doing the same query with different breakouts, which is why it sort of works for native queries)

@r-kot
Copy link
Contributor Author

r-kot commented Oct 24, 2018

I like the idea of calculations in clojure, but i'm not sure what should be returned by endpoints (eg /api/card/:id/query). I prefer to get a list of DatasetData, somenthing similar to:
DatasetData = {
rows: Row[],
cols: Column[],
columns: ColumnName[];
additionalResults?: DatasetData[]
}

I implemented (in js) simple calculator for totals. It has bugs, but after some work we can use it instead of additional queries.
Calculations, unfortunately, are made on js primitives instead of decimals (sometimes precision of calculations matters).

@r-kot
Copy link
Contributor Author

r-kot commented Oct 29, 2018

Hi @metabase/core-developers,
I fixed an error regarding "avg" queries.
What is the next step?
Do you have any conclusions on how we should solve the problem with a heavy load on the backend?

@dannyeuu
Copy link

dannyeuu commented Nov 5, 2018

There are no labels here for when is planned to release this feature? =)

@jornh
Copy link
Contributor

jornh commented Nov 5, 2018

@dannyeuu correctly observed 🙂

Given concerns raised in the above discussion starting in #8427 (comment) I think that’s only appropriate. Even though the PR includes a lot of nice stuff (that I would also like to put my fingers on) if proper solutions/answers are not found to those concerns I’d rather wait.

Putting a milestone estimate on a PR like this as this stage is probably like answering the question “How long time does it take to catch a fish?”. I’d rather have the core developers judge by their gut feeling + facts/testing WHEN this is ready than risk introducing (too much) quality debt.

@rijnhardtkotze
Copy link

Do we have an update on this?

@r-kot
Copy link
Contributor Author

r-kot commented Nov 7, 2018

Hi @rijnhardtkotze,
I will try to fix the performance problem by the end of next week.

@rijnhardtkotze
Copy link

Hey @r-kot

We deployed your active branch and we are finding a few hiccups.

When deployed in an iFrame, it throws the following error:
Uncaught (in promise) TypeError: Cannot read property '1' of undefined

Here is the complete stack trace:

Uncaught (in promise) TypeError: Cannot read property '1' of undefined
    at buildResultProvider (summary_table.js:317)
    at SummaryTable._callee3$ (SummaryTable.jsx:166)
    at tryCatch (runtime.js:65)
    at Generator.invoke [as _invoke] (runtime.js:303)
    at Generator.prototype.(anonymous function) [as next] (https://new_metabase.howler.co.za/app/dist/vendor.bundle.js?f0c8eed4e0df846129c1:148115:21)
    at step (SummaryTable.jsx:28)
    at SummaryTable.jsx:28```

@agoud1993
Copy link

agoud1993 commented Dec 20, 2018

I want to do the totals for the fields which are highlighted.
subtot
subtot2
@mazameli how did you do that. It's not working for me either in custom query oir SQL editor. The option for using summary or pivot table is not even showing up for me.
My current version of Metabase is 0.31.2
and I'm using Firefox browser version 52.6.0.0

@FallenChromium
Copy link

FallenChromium commented Dec 21, 2018

@r-kot any things we could help about this PR? It seems really awesome and apart from performance concerns it feels like a finished part of Metabase. If you will describe goals you're trying to achieve, probably community here can help you!

@dannyeuu
Copy link

dannyeuu commented Apr 4, 2019

It's stopped this PR @mazameli ? It will be really nice to have this!

@FallenChromium
Copy link

@dannyeuu I suppose that asking Maz about this is not the best idea. Probably it would be better to reach @r-kot first, or ask core-developers if they can finish this PR

@rijnhardtkotze
Copy link

Leaving another bump here.

Any update?

@rijnhardtkotze
Copy link

Just following up again.

Any updates here? Super keen on getting this!

@dannyeuu
Copy link

Just following up again. Any updates here @tlrobinson ?

@skaboy007
Copy link

It is unfortunate that this feature is no longer being developed looks like a must for Metabase. a natural representation of a drill-down
@r-kot Cheers what do you think what will it take to complete this

@walterl
Copy link
Contributor

walterl commented Jan 30, 2020

This is a great feature, but, after some more discussion, this solution seems like it could have detrimental effects. Especially for users with big data sets.

That, and given how outdated this PR has become, I'm going to close it. If anyone else is willing to take a stab at it, feel free to take up the baton.

@walterl walterl closed this Jan 30, 2020
@gnosis93
Copy link

A must-have feature, I hope that it might get considered in future releases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet