Soon you'll probably find it easier to just type everything from scratch but no don't do that no copy paste and no typing from scratch instead you're gonna do this.
you're gonna open up excel and select blank workbook then go to data get data from file from folder now you get to pick the folder in which.
You have all of your pdfs saved in mine is sitting right here in the folder sales pdf i'm going to select that and open now in this view that opens up we get to peek inside that folder these are.
All the different files that are saved there now just click on transform data here we can see more information about the file we can see the extension the date it was modified date created and.
Some more attributes we don't see the content of each file until we click on this double down arrows with this we get to combine the contents of all these files so power.
Represent the sample but i'm fine with this i'm going to keep it dynamic and go with the first file this is the content of the file i get a table view and the page view in this case these are the.
Same views so i'm just going to pick the table one and click on ok now power query is going to go ahead and create some transform routines based on that sample file and then it's.
Going to apply those to each single file and finally it's going to append the results so all of the stuff that appeared here was done automatically by power query what i get.
Here is the file name which can become useful if you need to keep that in your final report we get the different columns in the file so we have date properly formatted as date first name.
Last name our text quantity formatted as a whole number and sales value is a decimal number and i can update this to the currency format all my information is appended together all the steps are.
Posts Related:
Recorded by power query all i need to do
Is send this data to the workbook but wait before i do that i don't want this column to look like this i just want to grab the store location from this part.So i'm going to double click and update the header to store now to grab this part where it says east and north let's use a transformation let's go to.
Transform extract text between delimiters my start delimiter is an underscore but the problem is that i have two underscores here i want the last one so.
I'm going to click on advanced options and scan for the start delimiter not from start of the input but instead from the end of the input so we're going to look backwards until we get to the first.
Underscore for end delimiter that's going to be dot pdf that's it click on ok and we've extracted our store location we're done here home and send this to the workbook let's just adjust.
The formatting of our table make this bold and add a bottom border okay so now we have all of our information all the 330 rows of data from multiple pdfs combined.
Into a single excel sheet now what happens when i get more data so i just have data until july let's bring in august data we just received a new pdf file from our north.
Store so i'm just gonna drag this and drop it inside the sales pdf folder now let's switch to excel i'm just going to go all the way down right mouse click and refresh.
We should see august data pop up and it was super fast everything was automatically appended now what if you want to make some adjustments to this what if you want to combine first name.
And last name together well no problem
Let's just double click on our consolidated query and do it right here select the first name column hold down control select last name right mouse.Click merge columns the separator between first and last name should be a space and i'm going to call this full name and ok and let's just send this back to our.
Workbook and we will have first name and last name combined into full name let's also update the formatting of this now what happens if i drag and drop a file in.
This folder that's not a pdf file for example i have this word file here let's just drag and drop it here when i go to my query here right mouse click and refresh.
I get corrupted i have a problem obviously the way to solve this is just to go here and remove my word document from this folder but i don't want to do that because i know i can't really trust.
My colleagues to remember not to save their other files inside that folder what i'm going to do is to improve my query i'm in my consolidated query here let's.
Jump back to the source step here we can see the file extension for each file in the folder now we can't see that word file let's just refresh the preview and it should pop up it's.
Right here what i want to do is to add a filter and only filter for pdf files so notice i'm back at the source step and i want that filter to occur immediately before any other steps are applied so.
I'm going to click on the drop down here go to text filters ends with let's insert a step and type in dot pdf now if you want to be on the safe side.
That your extension could be in uppercase letters instead of lowercase you can apply a change case step as well but in this case i'm only expecting dot pdf to be in lower case so i'm going to.
Apply that and then everything should run smoothly let's go close and load and our query updates without problems right now we might be thinking this is foolproof nothing can mess up this query.
But not too fast we have another colleague who happened to put in this excel file so let's just drag and drop it in our folder so now we have this excel file in here as well we did fool.
Proof this right so if i refresh this query i shouldn't run into any problems but i immediately get that corrupted message again why is this happening let's open up our query when i go back.
To the source step let's refresh the preview i can see excel there once i filter it everything is gone but why is that error still there.
Well remember all of these transformations are based on a sample file and originally we selected the first file in the folder to be the sample file which now has become our.
Excel file so if i refresh the preview here it's going to fail to load because it can't read it properly we need to filter out any known pdf files at the sample file level as well so we're going.
To go to sample file here notice there is a source step and if i refresh the preview we can see that first file has our xlx extension we need to apply a filter for extension here as well let's.
Go to text filters ends with we're going to insert a step and it ends with dot pdf and click on ok and this way only pdf files are going to show up so when we pick the first file.
By default that's going to be a pdf file so let's just refresh the preview we can already see that it's working here now we can be sure that only pdf files are included okay so i've already done a.
Separate video that goes into a lot more detail about importing data from a pdf file so in case you run into any problems when you're importing your pdf check out.
That video you might find a solution to your problem there now remember this is a feature that's available in office 365 and office 2021 so if you don't have it that's probably.
Why okay so that's it for today let me know if you've ever imported pdfs to excel like this and what your experience with it has been so far i'd love to know as always.
Thank you for being here thank you for watching subscribe if you aren't subscribed yet and i'm gonna see you in the next video [Music].