Home Forums Power Pivot Summing up attributes in dimension table

This topic contains 14 replies, has 2 voices, and was last updated by  tomallan 5 months, 3 weeks ago.

Viewing 15 posts - 1 through 15 (of 15 total)
  • Author
    Posts
  • #9188

    stefandrescher
    Participant
    • Started: 2
    • Replies: 11
    • Total: 13

    Hi there!

    I ran into a problem and I can’t find any helpful posts on the net.

    First of all I have to use PowerPivot v1 due to IT restrictions in my organisation.

    My data model currently consits of a fact table containing revenue data on a transaction level and a bunch of dimension tables including a date table. See attched pdf. Now I wanted to add a campaign dimension. As I only need the number of recipients on a campaign level I did not want to add another big dimension table that would contain all the customers in relation to the campaigns. I know that this would probably be the way to go but this would result in a massive amount of data with limited insight. So I precalculated the count of the members on a campaign level in SQL.

    When I now write a simple measure to sum up the members the pivot table shows a lot of campaigns that do not belong to the corresponding campaign group (a campaign group groups the campaigns that have been carried out for example on Christmas. It is a seperate dimension connected to the fact table – see also attched PDF).

    First thing that I found out was, that there is a problem with campaigns that do not have an entry in the fact table. Some campaigns did not generate any revenue. I guess as PowerPivot does not know where to put them it just throws them in everywhere which leads to the problem that the filter of the campaign group can not be applied correctly.

    So I created a calculated column that checks if the campaign has an entry in the fact table. If yes it returns 1, if not it returns 0. Than I created a measure to check for this.

    =CALCULATE(SUMX(‘Dim_campaign’,’Dim_campaign'[member_count]),FILTER(‘Dim_campaign’,’Dim_campaign'[fact_check]=1))

    This did not change anything at all. Hower if I put the fact check directly into the measure it looks much better:

    =CALCULATE(SUMX(‘Dim_campaign’,’Dim_campaign'[member_count]),FILTER(‘Dim_campaign’,COUNTROWS(RELATEDTABLE(Fact_Revenue)) > 0))

    So that was the first thing I really did not understand. Why is there a difference in the behaviour of PowerPivot?

    After checking the numbers I ran into the next issue. If there is a campaign within a campaign group that did not generate any revenue I still want to show the members that have been addressed. With my current measure they get filtered out. So my question would be how to set this up that the campaign group filter works correctly?

    I can’t get my head around it. Any help is highly appreciated! Thanks!

    Stefan

    Attachments:
    You must be logged in to view attached files.
    #9192

    tomallan
    Keymaster
    • Started: 9
    • Replies: 2377
    • Total: 2386

    Stefan,

    Glad to hear that you have SQL experience which you can use to shape your datasets that make up your PowerPivot models. Also, your choice to aggregate the count of members in the Dim_Campaign table sounds very practical.

    Now to answer your questions…

    Putting the fact check in the measure makes more sense because calculated columns are static (always the same until the next data refresh). So no matter how you filter and slice a pivot or a DAX query, the value for the calculated column for fact check will remain the same. Your experience with this static behavior of calculated columns is reflected in your statement that the calculated column “did not change anything at all”.

    In contrast, a measure responds dynamically to the existing filters (what is on pivot rows, columns, pivot filter, and slicers). You can think of measures as a formula with a dynamic “WHERE” clause that is reformulated in each pivot cell based on the current and modified filter context. In contrast, you can think of a calculated column that has a “WHERE” clause that is only applied when the entire table is refreshed in the data model.

    So, to answer your first question, calculated columns and measure behave differently because they refresh differently.

    For your other question, you may/may not know that you have setup a many-to-many relationship between Dim_Campaign and Dim_Campaigngroup that is probably totally unneccesary where fct_Revenue is the bridge table (so as long as a campaign generates no revenue, the campaign group filter can never be applied to the campaign table. Almost certainly Dim_Campaigngroup should be either a column in or a parent table to Dim_Campaign. IF Dim_Campaigngroup contains only an ID column and the name of the campaign group, you should use your SQL skills to move the campaign group name to the campaign table (star-schema dimensions are almost always a better solution than a snow-flaked dimensions).

    A question for you: how many unique stakeholder attributes do you track? If the total number of attributes is relatively fixed, you could get away from the many-to-many relationship between customer and stakeholder attributes (model would be simpler to work with).

    If you do not already have a copy of “Power Pivot and Power BI” by Rob Collie and Avi Singh, I highly recommend the reference book: between the book and the forums and your experience, you will find answers to all of your DAX and data model questions.

    Tom

    #9208

    stefandrescher
    Participant
    • Started: 2
    • Replies: 11
    • Total: 13

    Hello Tom,

    thank you so much for your detailed answers! They are a great help!

    I think I got the difference between the fact check in the table or in the measure.

    No, I didn’t realize that I created a many to many relationship between the two dimensions. Now you are mentioning it, it absolutely makes sense. But isn’t that always the case? Isn’t there also an m-to-n relationship between Dim_campaign and Dim_customer?

    You are right. Dim_Customergroup has only an ID and the name. However I struggle with the idea of parent-child hierarchies so far. The idea is that I join the name to each campaign so that every row now has a campaign group name and then I link the campaignID to the fact table, right?

    In this case that makes sense to me. However related to your question I was not able to apply this concept to the stakeholder attributes table. In reality there is also a parent-child relationship here but I modeled it as 3 separate dimensions related to the bridge table.

    The hierarchy looks like this: group > role > detail

    Every group has a role but not all roles have details. So applying the idea from above I would use the detailsID to connect them with the customer. This would lead to the issue that the roles that have no detail would not be displayed. As I have to use PowerPivot v1 I tried this way https://www.sqlbi.com/blog/alberto/2011/02/15/powerpivot-and-parent-child-hierarchies/ but I have not succeeded. It seems that all these issue have in common that I don’t really know how to model these types of relationships.

    I do have Rob’s book “DAX Formulas for PowerPivot”. I haven’t bought any newer version because usually I can’t apply the techniques as I am lacking the required functions as I have to use PowerPivot v1.

    Thanks again for your great help!

    Stefan

    #9209

    tomallan
    Keymaster
    • Started: 9
    • Replies: 2377
    • Total: 2386

    Stefan,

    Your kind words are appreciated!

    After taking a little more thought, I would like to ask a few more questions before continuing with suggestions for your model design:

    Does a campaign ever belong to more than one campaign group?

    Does a customer ever belong to more than one “stakeholder” group?

    Tom

    #9218

    stefandrescher
    Participant
    • Started: 2
    • Replies: 11
    • Total: 13

    Hi Tom,

    thanks for your effort!

    No, a campaign always just belongs to one campaign group. I also just dicovered that there are two more issues with that. There are revenue transactions on the campaign group level with no corresponding campaign. Logically this does not make any sense but as the source system obviously allows this it should be covered. Also there campaign groups with no corresponding campaigns.

    Yes, a customer can belong to more than one stakeholder group.

    Your help is greatly appreciated!

    Stefan

    #9220

    tomallan
    Keymaster
    • Started: 9
    • Replies: 2377
    • Total: 2386

    Stefan,

    So there is a many-to-many relationship between customers and stakeholder groups. It might be interesting and worth the discussion to evaluate the structure of the bridge table and stakeholder attributes table. Let me know if you would like to continue discussing this item.

    Regarding campaigns and campaign groups, you did mention that the current configuration may not be producing accurate results: “When I now write a simple measure to sum up the members the pivot table shows a lot of campaigns that do not belong to the corresponding campaign group…”. This could be traceable to an either an error of data input *OR* to an underlying but hidden fact that there really is a many to many relationship between campaigns and campaign groups. We could also pursue this topic if you choose.

    Tom

    #9221

    stefandrescher
    Participant
    • Started: 2
    • Replies: 11
    • Total: 13

    Dear Tom,

    thanks again for your prompt answer and your offer to discuess this further!

    I attached a more detailed plan of the the data model.

    The bridge table contains of contactid (Dim_Customer_private), accountid (Dim_Customer_company), stakeholdergroupid, stakeholderroleid, stakeholderdetailid.

    Also I moved the Dim_Campaigngroup over to serve as a parent table and related it to the campaign table. I was hoping that in this way it would also show the campaigngroups that do not have any corresponding campaigns with a revenue of 0 in the pivot table. There was also a small hope that it shows the revenue of campaings where  transactions exist on a campaigngroup level but not on the campaign grain.

    Thank you very much!

    Stefan

    Attachments:
    You must be logged in to view attached files.
    #9233

    tomallan
    Keymaster
    • Started: 9
    • Replies: 2377
    • Total: 2386

    Stefan,

    The default behavior for items on pivot table Rows or Columnswith blanks in all measures is to not show that specific row or column. That behavior can be overwritten in the Pivot Table options dialog on the display tab.

    Also, you can override the use of blanks in numeric fields of a pivot with a zero (also in the pivot table options dialog).

    There are workaround options to show revenues of existing CampaignGroups that do not have a campaign, but would require steps in Power Query or at the data source. One general approach would be to re-use the CampaignGroup name in the campaign table. This approach could work *IF* it is absolutely impossible for a campaign and a campaign group to have the same ID value. It could also work otherwise, but you would need to do something to differentiate a re-used group ID from the other campaign ID’s. IF you would like to explore this idea, just let me know.

    Tom

    #9234

    stefandrescher
    Participant
    • Started: 2
    • Replies: 11
    • Total: 13

    Hi Tom,

    thanks again for your answer.

    If I overwrite this option as you suggested it shows all rows without any data and repeats that in every campaingroup. So it seems that it loses the filter context of campaigngroup.

    I am afraid I don’t really understand this idea. From my understanding I would move the Campaigngroup name into Campaign table with a LEFT JOIN. In this way I get campaigngroups that don’t have a campaign (campaign_ID is NULL). But I can’t relate the new Campaigngroup dimension to the fact table as this would be a many-to-many relationship. I guess the example attached makes my probably wrong understanding of this clearer than all my writing.

    Concerning the stakeholder set up: Does that make sense to you?

    Thanks a lot (again!),

    Stefan

    Attachments:
    You must be logged in to view attached files.
    #9240

    tomallan
    Keymaster
    • Started: 9
    • Replies: 2377
    • Total: 2386

    +Stefan,

    If the workbook is connecting to tables in a relational database like SQL Server, and if the source tables had the name and structure shown in the workbook, the SQL code could look like this:

    CREATE VIEW vwCampaigns AS 
    SELECT 
    	 Campaigngroup_Name AS CampaignGroup, 
    	 CASE
    		WHEN Campaign_ID IS NULL 
    			THEN 'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) ) 
    		ELSE 
    			Campaign_ID 
    	END AS Campaign_ID, 
    	CASE  
    		WHEN Campaign_Name IS NULL 
    			THEN Campaigngroup_Name
    		ELSE
    			Campaign_Name 
    		END AS Campaign
    FROM dbo.Dim_CampaignGroup_New
    
    GO
    
    CREATE VIEW vwRevenue AS SELECT 
    	Transaction_ID,
    	CASE 
    		WHEN Campaign_ID IS NULL AND Campaigngroup_ID IS NOT NULL 
    			THEN 'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) ) 
    		ELSE Campaign_ID
    	END AS Campaign_ID, 
    	Revenue, 
    	DateKey, 
    	Customer_Private_ID, 
    	Customer_Company_ID 
    FROM dbo.Fct_Revenue
    
    GO

    Please see copy of attached Excel 2010 workbook with data model and pivot.

    Concerning the stakeholder set up, it makes sense to me. Does it return correct results for you?

    Tom

    Attachments:
    You must be logged in to view attached files.
    #9247

    stefandrescher
    Participant
    • Started: 2
    • Replies: 11
    • Total: 13

    Dear Tom,

    thanks for the query and the workbook. I can’t stress enough how much I appreciate your effort. I am currently a one man show who is learning everything by doing. So it is great to have somebody to discuss this.

    I am on vacation this week and for some reason I can’t download the workbook. I will check it out as soon as I get the chance to. I think I understand the concept but I am not sure about what the ‘g|’ is doing. If this is not a function or something similar but a static value it would mean that if there are let’s say 5 transactions with no campaign but the same campaigngroup it will produce 5 times the same campaign ID and hence lead to many-to-many relationship if I relate the campaign ID to the fact table. But maybe I get it wrong and the workbook will clarify it.

    Best,

    Stefan

    #9248

    tomallan
    Keymaster
    • Started: 9
    • Replies: 2377
    • Total: 2386

    Stefan,

    The purpose of the “G|” as a prefix to an ad-hoc creation of a ID column value, is only to remind anyone following this discussion, that creating an ID value from another ID runs the risk of a value collision. Here, “G” represents that the new ID is borrowed from the a group column and the “|” is a symbol with extremely low usage. Together the two characters, plus the borrowed group ID column, are meant to ensure a unique campaign ID.

    There is no possibility of such a ad-hoc value creating a many-to-many relationship. The reason this particular technique was chosen is ease of implementation in both parent and child tables.

    I understand you are a one-man show and from your comments I hear some frustration. It would be great if your employer would invest in an online or live PowerPivot course/workshop for you. If there is a power pivot user group with regular meetings anywhere close by to where you work or live, I recommend such highly (as long as the user group supports and encourages dialog between members).

    Also, for other following this post who do not have experience with SQL, I have attached to this comment another Excel 2010 workbook that uses Power Query to reach the same end.

    Tom

    Attachments:
    You must be logged in to view attached files.
    #9378

    stefandrescher
    Participant
    • Started: 2
    • Replies: 11
    • Total: 13

    Hello Tom,

    for some reason I have trouble to edit posts. When I edit my post and submit it again it disapears. If I try to post it again it does not show up anymore. Maybe this has to do with diplicate content…

    You have been right. There is no possibility that your proposed ad-hoc value creation creates a many-to-many relationship. I think half of my brain was already on vacation.

    I spent the last few days to get this working. I had to tweak your query a little because I also have campaigngroups that have several campaigns with AND without campaignIDs. So they have both.

    SELECT
    
                Campaigngroup_Name AS CampaignGroup,
    
                CASE
    
                            WHEN Campaign_ID IS NULL
    
                                       THEN 'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) )
    
                            ELSE
    
                                       Campaign_ID
    
                END AS Campaign_ID,
    
                CASE 
    
                            WHEN Campaign_Name IS NULL
    
                                       THEN Campaigngroup_Name
    
                            ELSE
    
                                       Campaign_Name
    
                            END AS Campaign
    
    FROM dbo.Dim_CampaignGroup_New
    
    UNION
    
                CampaignGroup,
    
    'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) ) AS Campaign_ID,
    
    ‘N/A’ AS Campaign_Name
    
    FROM dbo.Dim_CampaignGroup_New

    After doing so I had the issue that CampiagnIDs were created that have already been created in the first part of the query. So I had to remove those:

    WITH Camp_CTE AS
    
    (SELECT
    
                Campaigngroup_Name AS CampaignGroup,
    
                CASE
    
                            WHEN Campaign_ID IS NULL
    
                                       THEN 'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) )
    
                            ELSE
    
                                       Campaign_ID
    
                END AS Campaign_ID,
    
                CASE 
    
                            WHEN Campaign_Name IS NULL
    
                                       THEN Campaigngroup_Name
    
                            ELSE
    
                                       Campaign_Name
    
                            END AS Campaign
    
    FROM dbo.Dim_CampaignGroup_New)
    
    SELECT *
    
    FROM Camp_CTE
    
    UNION
    
                CampaignGroup,
    
    'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) ) AS Campaign_ID,
    
    ‘N/A’ AS Campaign_Name
    
    FROM dbo.Dim_CampaignGroup_New
    
    WHERE 'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) ) NOT IN
    
                            (SELECT Campaign_ID
    
                            FROM Camp_CTE)

    This led to the issue that now there were CampaignIDs created for CampignGroups that had no transaction with a CampaignGroup but no CampaignID. So I needed to include this check in the query.

    WITH Camp_CTE AS
    
    (SELECT
    
                Campaigngroup_Name AS CampaignGroup,
    
                CASE
    
                            WHEN Campaign_ID IS NULL
    
                                       THEN 'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) )
    
                            ELSE
    
                                       Campaign_ID
    
                END AS Campaign_ID,
    
                CASE 
    
                            WHEN Campaign_Name IS NULL
    
                                       THEN Campaigngroup_Name
    
                            ELSE
    
                                       Campaign_Name
    
                            END AS Campaign
    
    FROM dbo.Dim_CampaignGroup_New)
    
    SELECT *
    
    FROM Camp_CTE
    
    UNION
    
                CampaignGroup,
    
    'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) ) AS Campaign_ID,
    
    ‘N/A’ AS Campaign_Name
    
    FROM dbo.Dim_CampaignGroup_New t1
    
    JOIN dbo.Fct_Revenue t2
    
    t2.Campaign_ID=t1. Campaign_ID
    
    WHERE 'G|'+ CAST ( Campaigngroup_ID AS NVARCHAR (255) ) NOT IN
    
                            (SELECT Campiagn_ID
    
                            FROM Camp_CTE)
    
    AND t2.Campaign_ID IS NULL

    This solved all the issues and the model runs as expected. I am also now able to display CampaignGroups that did not generate any revenue using the option to display these as you proposed above. Also there is no need for complicated measures anymore. SUM and the like solve most of requirements. This also makes things fast.

    I am really happy with the result and I can thank you enough for the time and effort you spent helping me out on this!

    Do think it would be worthwhile to rename the topic of this post? I am not sure if it really covers what we have been discussing so far. Finding this thread on google would have saved me a lot of time and it seems to me that this must be a common issue – at least the ad-hoc value generation part of it. On the other I wouldn’t have any good suggestion for a new name.

    By the way, I was able to open the excel workbooks you attached but I wasn’t able to open the powerpivot model as it said that it was created in a newer version of powerpivot.

    I will now start working to get all 3 stakeholder dimension into one. It produces the right results in the current setup but it would be a lot more intuitive for the user to have these in one dimension as they are used hierarchically.

    Tom, thanks a lot again for your great effort! It really took this model to the next level.

    Stefan

     

    #9379

    stefandrescher
    Participant
    • Started: 2
    • Replies: 11
    • Total: 13

    I just realized that there a couple of typos, formatting issues and missing words in my post. I apologize for that but I do not dare to edit it again…

    #9380

    tomallan
    Keymaster
    • Started: 9
    • Replies: 2377
    • Total: 2386

    Stefan,

    Thanks for following up!

    Tom

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic.