Crosstabs in PostgreSQL: PivotMyTable

Crosstabs in PostgreSQL: PivotMyTable

PivotMyTable is a PL/Python function for use in PostgreSQL servers. Its aim is to get crosstab  tables in PostgreSQL in a more friendly way that PostgreSQL module tablefunc does with its crosstab series functions and in fact it behaves ,at last, as a proxy for tablefunc functions.

PivotMyTable makes possible to create crosstabs in PostgreSQL in the same way that other available solutions, automating the creation of the queries that the tablefunc crosstab functions need to work.

More precisely, two queries and one column specification are created:

  • First query specifies the categorization/grouping of the original table
  • Second query gets the list of summarized columns from pivot column.
  •  The column specification determines the type of the output columns.

The problem with another solutions is that they area, as far as I know, not enough dynamic to make crosstabs with ease. And also some of them just return you the query to be executed. PivotMyTable creates the table for you and makes a nice work regarding the type of output fields choosing its correct type, what makes unnecessary for the user to specify it.

Cat besides table

Well, not a crosstable in PostgreSQL at all…

Yet another nice feature is the ability of PivotMyTable to use more than one column to categorize the data. This is accomplished by creating a joined column prior to the crosstabulation, and subsequently re-creating the fields which origined the joined column.

Also, PivotMyTablee makes possible to get percentages in the pivoted tables, as well as get rid of null values in the oputput tables. This is a desirable feature, for instance to use the data in other application like R. Bear in mind that updating the pivoted columns one by one can be an ordeal if your table has more than maybe 20 or 30 columns… not to think about having 1K output columns!!!

PivotMyTable is released under GPL v3 license and can be downloaded from my GitHub repository. Feel free to contact me for suggestion or bug/error reporting.

PivotMyTable requires the PostgreSQL extension tablefunc and the language PL/Python installed in the database to work.

The usage of PivotMyTable is simple: Providing that you have a table myinfo like:

player tool round hits
Pepito Hammer Rd1 12
Pepito Hammer Rd2 13
Pepito Hammer Rd2 4
Pepito Wrench Rd5 1
Manu Wrench Rd1 12
Manu Wrench Rd1 16
Manu Hammer Rd2 3
Richal Hammer Rd3 42
Richal Hammer Rd1 17
Richal Hammer Rd4 22
Richal Hammer Rd2 15
Richal Hammer Rd1 17

You can issue this query:

To to create a crosstab in PostgreSQL like this:

player tool Rd1 Rd2 Rd3 Rd4 Rd5
Pepito Hammer 12 17 0 0 0
Pepito Wrench 0 0 0 0 1
Manu Hammer 0 3 0 0 0
Manu Wrench 28 0 0 0 0
Richal Hammer 34 15 42 22 0

Function Parameters/Options

All the parameters of the type varchar must be assigned between single quotes for PivotMyTable to work.

  • input_table (varchar): Name of the table to get data to pivot.
  • output_table (varchar): Name for the output table.
  • group_fields(varchar): Name of the column(s) for categorizing the data. Unlike the native tablefuncmodule functions crosstab, it accepts multiple columns, separating them with commas.
  • pivot_field(varchar): Column to be pivoted, must be a single column.
  • value_field(varchar): Data to aggregate for the pivot field. It is expected for value_field to be a numeric (int, float, etc) field.
  • agg_func(varchar): Aggregate function to apply to the data from value_field. It must be specified without brackets. It’s user responsability to check for function/data type compatibility and resulting data. The available(tested) functions are: sum, count and avg.
  • as_percentage(bool): This option enables the output as percentage of each output columns over its overall sum BY ROW. It defaults to FALSE, so to enable percentage calculation set it to TRUE. This option is only available with sum as the selected agg_func.
  • sort_order(varchar): This option enables sorting the resulting pivoted columns. You can specify ‘asc’, ‘desc’. It defaults to ‘no_sorting’.
  • drop_ex_tbl (boolean): pivotMyTablee checks if the output table specified with the parameter output_table exists before further processing. Setting this option to TRUE makes possible to automatically drop the existing table. By default, drop_ex_tbl is set to FALSE so the function exists with a warning if output_table already exists in the database.
  • as_view (boolean): This option enables the output to a view instead of to a table. To enable view output, set it to TRUE. Defaults to FALSE.

I hope this work helps you!!! Please feel free to drop me a tweet for feedback!!!