Introducing Python in Excel - ID Card Make

Introducing Python in Excel - ID Card Make My God, okay, it's happening! Everybody, stay calm - what's the procedure everyone? So, Python for Excel is here and it's ready to test in the Beta channel of Office 365. Excel has been getting a lot of updates, but this... This is going to put it in a different league. So, you might be thinking, "That's not for me. I'm not a programmer, I'm not a geek. I just do my daily Excel stuff. I don't need this." Wait.

Introducing Python in Excel

Until you see what I'm going to show you, and I recommend that you watch the whole thing and take some time to digest it, because I didn't want to give you a stroke. So, I've sprinkled all the cool stuff throughout the video. So, sit back and let's start from scratch. Okay, so when you open up Excel and you have Python, you are going to find it in the Formulas tab. There is a Python section here. You can get started by inserting your own.

Custom Python formula or exploring Python samples. And then, start by taking this tour. But we are going to get started together right now, and I'll show you how this works. The first thing we need to do is get into Python mode. One way we can get there is by going to Insert Python and then inserting a Custom Python Formula. Now, keep your eye on the formula bar and see what happens when I select that. The box changes into a Python box. Now, another way you can.

Get in Python mode is by just typing in "=PY". This function creates a Python formula. So, the moment I press Tab, I'm already in Python mode. What happens now? Well, let's say as our first example we want to give some insights from this dataset. If you're working with data, the first step you need to do is to send your data to Python, because it needs to be able to see it to work with it. The way we can do that here is by just referencing the cell. I'm just going to.

Select it, use the keyboard shortcuts, Ctrl + Shift and the down arrow key, to select the whole range. And notice what we get in the formula bar. We have our data range, "headers=True". That was automatically picked up, and everything is wrapped inside an XL function. Now, when I press Enter, nothing happens. What Enter does is it adds new lines to my Python script, because Python scripts could get long. So, it makes it easier to work with these scripts. So, for me to be able to.

Run this, instead of Enter, I need to press Ctrl + Enter, and I get what is called a DataFrame. If you know Python, you know what this is. If not, it's basically a condensed version of your table. It's a two-dimensional data structure that's all sitting in a cell. Now, take a look at this. Here we have a dropdown. What is being returned here is a Python Object, that's our data frame. But I can switch my view and return Excel Values. When I do that, I get my data set back. But this.

Takes up a lot of space, so I'm going to switch it to a Python object and get my DataFrame. Now you can think of this DataFrame as an Ikea table. Right, when you buy a new table from Ikea, it's all condensed in a single package. If you want to get a glimpse of what's inside, you can click on this card, and you get to see the beginning rows of your data set and the ending rows. This DataFrame is a fundamental structure of a specific Python.

Library that's called Pandas. The Pandas Python library is great for data analysts, and you need to know the name. I know it sounds weird, but you need to know it if you want to look up other functionalities in Python's library. But I'm going to explain more about libraries and Pandas later. What I want to show you right now is some of the cool things you can do here. So, for example, I'm going to go to another cell, go to Python mode by typing in "=PY"..

I'm going to reference my DataFrame and then use the Describe method. So, I'm just going to type in ".describe". So, I'm in Python mode now, not Excel mode. I'm going to open and close brackets, press Ctrl + Enter to run this, and I get a DataFrame back. But I want to see the Excel Values. I'm going to switch my view to Excel Values, and I get to see some information about my data set. So, I have the Quantity and Sales columns.

Here. This is the count of each column, the mean, standard deviation, and so on. Now, if we take a look at our data set, we have date, product, quantity, and sales. Now, let's say I wanted to describe the Product column. I'm just going to go and adjust my function. So, right after I reference the DataFrame, which is the data set here, I'm going to put in my column "Product". The way you reference columns is by opening up the square brackets, similar to.

Referencing Excel tables, and then typing in, in quotation marks, the name of the column headers. In this case, "Product". And I'm going to close the square bracket, press Ctrl + Enter, and I have some information about products. So, I have 96 products, 15 are unique. The top is "Laptop Bag Red" because that occurs 15 times. It's a quick way of getting insights from your data. Now, if you're planning on using the data that you uploaded to Python and doing different types.

Of calculations, it helps to give it a name. This way, you don't always have to reference the cell. The way you do that is by going to your first DataFrame and then just typing in any name, as long as it doesn't have spaces in it. So, I'm going to call it "df" for DataFrame, equals this data set. So, when I run this right now, nothing changes. All I've done is just give this a name. But when I come to do other types of calculations based on this DataFrame, I no longer.

    Need to reference it by referencing the cell. Instead, I can just type in "df", and that's it. - ID Card Make

    Now, talking about shortcuts, another shortcut that I like is to avoid these square brackets. So, whenever I can avoid them, I will. And the great thing with the Pandas library is that it transforms your headers into attributes. So, as long as you don't have spaces in the headers, you can actually type them out like this. So, I can mention the DataFrame 'dot' the header 'dot' describe, and when I run this,.

    I get the same thing. But if they happen to have spaces, you need to use the square brackets. Now, there are lots of functionalities in the Pandas library. So, some simple ones that you're familiar with in Excel is, for example, the SUM or the AVERAGE function. So, let's say I wanted to use Python and sum the sales column. I'm going to go to Python mode, but this time using my shortcut keys Ctrl + Alt + Shift + P. Now, this sounds like a complicated shortcut, but it's actually easy.

    To reach for on the keyboard. Then, I'm going to mention the DataFrame. I gave it a name, it's df, 'dot', I want to sum the sales column. So, I'm going to go with ".sales", it doesn't have any spaces. I can reference it in this way. .sum method. Run this, and I get the total of my sales column. So, let's just double-check this. That's 174510. So, if I sum this, 174510. Now, if I want to get the average, I'm going to use mean here. And that's the average..

    All of this you can do with simple Excel formulas, right? Why do we need Python? So, let's take this a step further. Let's say I want to get my total sales values for each date. Notice, my dates are repeated, but I want to aggregate them for each date, similar to what you would do in a pivot table. I'm going to go to Python mode, then reference my DataFrame, which was df, .groupby I want to group everything by the date.

    Column. And I want to have the total values from sales. I'm going to go with .sales, .sum, and I'm going to press Ctrl + Enter. And I'm going to get my series returned as a Python object. But if I want everything as Excel values, I'm just going to spill them, and I have the total sales for each date. And if you don't believe me, let's just double-check this. So, this is for the first of June, and I have these four numbers here. That's 9680. That's 9680..

    Now, what if I change my mind and I want everything on a monthly basis? Well, check this out. I'm going to go inside groupby, and I'm going to use a class called pd.Grouper. My key equals the date column, and my frequency equals, I'm going to put M here, and close the bracket. Run this, and I get my sales on a monthly basis. So, total sales for June should be 38060. Let's just double-check this. And we have 38060..

    Now, okay, you might say, "Well, a pivot table can do this." The great thing about this is that these are formulas. So, if I change this value to let's say 6000. Keep an eye on the June value here. And I press Enter, all of these refresh immediately. I'm just going to press Ctrl + Z to go back. Something to keep in mind is that you always get the last expression returned. So, you could be writing a lot of code, but whatever you end up seeing in the cell is based on the last.

    Expression. Let me show you something even cooler. I'm going to give this a name. I'm going to call this "forchart," and we are going to put a chart in a cell. So now, I've called this series "for chart," and I want to plot this series. I'm going to do forchart.plot. I want to put a line chart. My x-axis is going to have my dates, which are in the date column. My y-axis is going to have my sales values. And the kind of chart I want to plot is a line chart. I'm going to run this,.

    And I get a tiny little line chart in a cell. You probably really can't see this, so let me go and merge the cells, and we get to see our monthly sales values. How cool is that? If I wanted an area chart, I'm going to just change this to "area," and I have an area chart. If I wanted frequency to be on a weekly basis, I'm going to change this to "W," and I get my weekly sales data shown as an area chart. I prefer line charts. I'm going to change that. So, notice.

    That the last expression is the one that plots the chart for me, and that's what's returned.

    Now, if you're wondering how I'm coming up with this syntax, how do I know these classes and the methods? Well, there is great documentation on this in the Pandas library. You have lots of information here on this webpage to help you get started and lots of code to play around with. Plus, you've got AI. There is Bing Chat and ChatGPT that can.

    Help you when you get stuck. Just double-check your results to make sure it's what you want. Now, before I move on and I show you some even more cooler stuff, there is an important thing to keep in mind, and that's the calculation logic of Python Cells. Python Cells calculate from left to right and top to bottom. So, notice here, first, I've created the DataFrame, and here, I'm referencing the DataFrame. I'm on the right side. Here, I'm below it. But if I copy.

    And paste this here, I'm going to end up with a hash value. I get diagnostics pop up here, and it tells me "name 'df' is not defined." That's because Python Cells calculate from left to right and top to bottom. I would have to bring my DataFrame here for everything to calculate fine. So, I'm just going to demonstrate that. I'm going to Ctrl + X and Ctrl + V this, and now, everything calculates fine..

    Now, let me show you some cool examples from the default libraries that we have. So here, I'm using pd.melt function to restructure this data set. So, I have quantity and sales in two separate columns. When I use pd.melt, I can bring them in the same column. So, this is similar to "unpivot" in Power Query. And look at the formula bar here. It's very simple to write this. I just need to mention my non-value columns, which are date and product,.

    And my value columns, which are quantity and sales. And it does the whole restructuring for me. Now, another cool one is query. Take a look at this. I'm going to go into Python mode, reference my data set, which is df, right? That's my DataFrame that I uploaded to Python. And I'm going to use query. Then, I'm going to put this in. What this does is it takes a look at the sales column. It grabs everything that's above 2000 and takes a look at.

    My product column to see if anything contains the word "black". So, this is similar to the QUERY function in Google Sheets, which is also similar to the syntax from SQL. When I run this, by default, I get a DataFrame back. But I'm going to show the Excel values, and I get everything that has the word "black" and sales is above 2000. Now, let's move on to something else. We can create charts like this. I have small multiples.

    Returned as a single object. Take a look at this one. This is a bit more complicated, but I also wanted to show you that you can write longer scripts and get Python to do what you want. So here, I just have dates and students from 2018 to 2022. And what I wanted to do here was this: I want to return the last four years, and I wanted to arrange the charts beside one another. I also wanted to make sure that the y-axis is showing the same values,.

    Right? So that I can easily compare each of these with one another. Now, I've even added labels to this. And you can do all of that in Python here. You can set the title, set the label, and so on. And yeah, I didn't do all of this on my own. I didn't know I had to flatten the axis, so I asked ChatGPT for help, and it helped me get what I wanted right here. Now, this is my favorite part of the video. What we're going to do is this: We have an Excel table,.

    And we have some feedback in each row. What we want to do is grab any URLs that we have inside each row. The tricky part is that they might be in the middle of the text, and the pattern isn't so obvious. So, sometimes they start with www, sometimes we don't even have that, they just end with .com. Sometimes we have https and some might just have http. Some might not even have a link. What we want to do is just grab any URLs in there. So here,.

    You can see we're grabbing both of these in the cell, and then we're omitting this one because it didn't have any links. Then we have this and this. How can we do this? The great news is you can import any Python libraries or modules that might not be there by default. So, the ones we used are there by default: the Pandas library for data analysis, Matplotlib for data visualization..

    But Python has so many different libraries and modules. There are ones for AI, for web, for regular expressions. So this RE that I'm importing here works with regular expressions. All I have to do is go to Python mode and import the library or module, and I've returned "re loaded." That's just because I want to see something in the cell aside from none, which is what I would see if I don't have this. Everything still works, but I'm going to see none or 0 in the cell..

    Right, so I'm just going to press Ctrl + Z to go back. Now, after I imported the library, what I did is create a data frame based on the table. And the way you do that is similar to before, except you just have to make sure that you have the right referencing for the table that you are including everything. So, I have "#All" to make sure that I'm also including the header. Now, when I hover over the DataFrame.

    Here and click on the card, I can see that the column header is a part of my DataFrame. Now, the next step is to write some Python code to get this done. What is done here is that I've combined the patterns using this. This is based on the patterns that I can see in my data set. This makes sure that anything that ends with .com is included, that anything that has https, http, or www is also included. Then, there is a function here that grabs the URLs,.

    And this function is run on each single cell, so on each single text that I have in my DataFrame. You can also see the syntax here. Wherever you come across the hash sign, this means that this is a comment. This is a good way of also testing your code. You can comment things out to see what is the last expression that's returned or just add in some documentation to your code. Now, the end here, we're just filtering out any rows that don't have any URLs. Now, the great thing.

    Is because it's referencing the table, it is dynamic. So, if I go in and add another URL, so let's say youtube.com, this is going to be automatically added to my URL list here. Now, if you're wondering whether you can connect Python to your Power Queries, yes, you can. You don't need to have the data physically loaded in your workbook. They can be there via Connection Only, and you can create a DataFrame if you want based on these and use Python code on them. Now,.

    DISCLAIMER: In this description contains affiliate links, which means that if you click on one of the product links, I'll receive a small commission. This helps support the channel and allows us to continuetomake videos like this. All Content Responsibility lies with the Channel Producer. For Download, see The Author's channel. The content of this Post was transcribed from the Channel: https://www.youtube.com/watch?v=FbBXtqtRnWU
Previous Post Next Post