Home Forums Power Pivot removing duplicate rows with Power Query

This topic contains 2 replies, has 2 voices, and was last updated by  Danl49 2 years, 1 month ago.

Viewing 3 posts - 1 through 3 (of 3 total)
  • Author
    Posts
  • #5511

    Danl49
    Participant
    • Started: 1
    • Replies: 1
    • Total: 2

    On page 196 of Power Pivot and Power BI (Collie and Singh), Power Query is used to split a flattened table into two separate tables, linked by the Customer ID. The technique is the same one I have used many times, but I have a question, so please bear with me.

    Using Access (and most other relational databases), the current address is assumed to have been extracted. My experience is that the extraction often uses the address valid at the time of the transaction, and that customers tend to move. The suggested method works as long as the data extraction was set up to extract the current address, my files give me the address at the time of the transaction, so removing the old addresses is an important step.

    The removal of duplicates, when using Excel, seems to be to find the first instance of the customer in the table (including the address). Excel “holds” this first row, and then deletes any additional rows referring to the same customer. Thus, to ensure that the most current address is kept, it’s important to sort the data to have the most recent address appear first for each customer first. Generally, this is as simple as sorting by customer ID, and then by order date (with the newest order on top).

    My question is: does Power Query use the same method – i.e. “hold” the first instance for each customer ID, and then delete any additional references to the same customer ID?

    It seems to, when I test some data, but that is limited to my data and might not be true in all cases. Comments welcome.

    #5512

    tomallan
    Keymaster
    • Started: 7
    • Replies: 2461
    • Total: 2468

    Hi,

    Here is a link to a technique used with success (including myself) that I think will answer your question about removing duplicates:

    https://social.technet.microsoft.com/Forums/en-US/00a09332-fe6e-47a4-a8a9-b71e38b01a86/how-to-remove-duplicates-based-on-sort-order?forum=powerquery

    This link was originally posted in this forum by MikeChina on October 2, 2015 at 5:02 pm.

    #5560

    Danl49
    Participant
    • Started: 1
    • Replies: 1
    • Total: 2

    Perfect!

    Thanks for the quick and accurate response.

    Dan

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic.