Skip to article frontmatterSkip to article content

5. Anaconda Toolbox and Code

Authors
Affiliations
Microsoft
Anaconda

Necessary Tools

  • Python in Excel
    • Should be available in most versions of Excel
    • Try typing =PY and hit tab
  • Anaconda account
    • Go to anaconda.com and register if you don’t already have an account
  • Install AnacondaToolbox add-in
    • Excel -> Home tab -> Add-ins
    • Search for “anaconda”
    • Click AnacondaToolbox, then install
    • Verify Anaconda group on Formulas tab

Anaconda Toolbox

https://www.anaconda.com/docs/tools/excel/toolbox/main

The Toolbox is designed to support Python in Excel.

It includes four main functions:

  1. Visual Chart Builder
  2. AI Assistant
  3. Sharing Datasets
  4. Sharing Code Snippets

We will only cover the Visual Chart Builder today.

Visualizations

Anaconda Toolbox provides a point-and-click interface to build a variety of seaborn charts.


1

First, create a new Excel workbook.

Let’s utilize Excel’s builtin Python in Excel guide to add the Iris Dataset.

Click Formulas -> Insert Python, then Tour Samples, and Insert the scatter plot example.

This should give you a new sheet named Python sample with a Table named IrisDataSet2.

It automatically created a scatterplot.

Go ahead and close the Python in Excel samples dialog.


2

Let’s recreate the scatterplot from the samples using Anaconda Toolbox.

  1. Open the Anaconda Toolbox from the Formulas tab.
  2. Click on Visualize with Python.
  3. Click New Chart.
  4. Choose Scatter Chart.

3

We need to choose the dataset for the chart.

Click the Select input field next to Data.

An option appears to select defined tables or to manually choose a range. Let’s select the IrisDataSet2 table.


4

The X- and Y-Axis are automatically populated from the input data.

Choose sepal_length for the X-Axis.

Choose sepal_width for the Y-Axis.


5

Click on the Preview tab.

This brings up a preview of what the chart will look like before we write the code to the spreadsheet.

We’re missing a title, but otherwise this looks nearly identical to the sample chart.


6

Click on the Code tab to see the code which our point-and-click interface is generating.

You normally don’t need to bother with this, but it’s available for both:

  • Learning about seaborn code and how UI options map to lines of code
  • Manual editing of the code to refine the chart beyond what the UI allows

7

Let’s add the title and send the chart to the grid.

Click the Design tab.

Set the Title to Sepal length and width analysis.

Now click Create. You will be prompted where you want to drop the chart within the spreadsheet.

Choose cell G28.


8

Look at the formula in cell G28. It should match the Code tab we saw earlier.

This code is being sent to Python in Excel and produces a chart.

Click on the Create Reference hover option to make the chart bigger and position it somewhere over the table data.


9

The chart can continue to be edited.

Let’s add more depth to the chart by coloring the points by species.

Go the the Setup tab.

Open the Color By option and choose species.

Click Apply and watch the chart in the spreadsheet update.

A legend has been added to explain the coloring.


Personal Exploration Time!

10

Take some time and try some of the following:

  1. Update the marker styling of the points so they can be distinguished in black-and-white.
  2. Position the legend in the upper-left corner.
  3. Create a new Pairwise Chart. Choose sepal and petal widths and lengths for the variables. Color by species.

Anaconda Code

https://www.anaconda.com/docs/tools/excel/code

Anaconda Code is bundled with the Toolbox, but has a different purpose than the toolbox.

Instead of complementing Python in Excel, it offers an alternative way to run Python within Excel using pyodide.

Pyodide lets you run Python directly in a browser, which is where modern Excel add-ins run. The side panel is an embedded webview.

Why is this useful?

  1. Local execution == faster execution (often)
  2. Ability to install new packages into the pyodide environment
  3. New run mode --> dependency order instead of row-major order
  4. User-defined Functions

11

Open up Sheet1.

Open Anaconda Code in Formulas -> Anaconda -> Code.

Click Create Code Cell. Place the cell in A1.

Enter the code

1+1

Press Ctrl+Enter or click the green Run button.

NOTE: If you get a #NAME? error, it means Anaconda Toolbox didn’t register the function =ANACONDA.CODE() properly during installation. The easiest solution is to uninstall the add-in and re-install.


Add Packages

12

Click the Environment tab (3rd icon) so we can add some libraries to our environment.

Click Edit, then Add.

Search for networkx and Add. Then search for polars and Add.

Click Add Packages. Then finally Save Changes.

A progress indicator shows the status of package downloads and installation. This might take a minute or two the first time Anaconda Code is run, but will be faster in the future as packages are cached.


13

Let’s add imports for those packages.

Click on the Imports and Definitions tab (2nd icon).

Add the following imports

import networkx as nx
import polars as pl

14

Click on the Home icon.

It still shows the code for cell A1.

Click Back to see a list of all defined Anaconda Code cells.

Then click the green + button to create a new code cell.

Choose cell B2 and click OK.

The Python editor opens and we can start writing code.


15

Enter the code to convert the Iris table to a polars DataFrame.

Start first by typing

data = REF(

A list of all tables defined in the notebook will show as autocomplete options. Hit tab to accept the Iris dataset.

Finish adding the code:

data = REF("IrisDataSet2[#All]")
df = pl.DataFrame(data[1:], schema=data[0], orient='row')

By default, the result spills to the grid.

The printed repr also displays in the log viewer below the code.


16

Anaconda Code supports the same output modes as Python in Excel.

Change the output type from Excel Value (#) to Python Object (</>).

Re-run the cell to see the change from spilled data to a card view.

Note: Anaconda Code uses </> as the default icon instead of [PY] to avoid confusion with Python in Excel cells.


17

Let’s use polars to do a simple groupby and column renaming.

Create a new Python cell by clicking Back, then the green +. Place the cell in B4.

Start by grabbing a reference to the DataFrame.

df = 

Click the REF button in the lower-right corner and select cell B2.

Finish by adding the group by and aggregation code.

df = REF("B2#")
df.group_by(pl.col('species').alias('Species')).agg(
    pl.len().alias('Count'),
    pl.col('sepal_length').mean().alias('Sepal Length'),
    pl.col('sepal_width').mean().alias('Sepal Width'),
    pl.col('petal_length').mean().alias('Petal Length'),
    pl.col('petal_width').mean().alias('Petal Width'),
)

Now we have a nice summary of the average lengths and widths by species.


Isolated Run Mode

18

You may be asking why we needed to redefine df?

The reason is that the default execution mode of Anaconda Code is Isolated.

This means that Python cells don’t share local variables with each other. To get access to the output of a cell, you must explicitly use REF(). This is very similar to xl() for Python in Excel.

Why make this the default mode? It avoid unnecessary recomputation.


19

Let’s demonstrate with a fun example. Mandelbrots!

  1. Enter a plain 128 in cell A10
  2. Create a new Anaconda Code cell in B10
  3. Paste in the code
import matplotlib.pyplot as plt
import numpy as np
import io
from PIL import Image

def complex_matrix(xmin, xmax, ymin, ymax, pixel_density):
    re = np.linspace(xmin, xmax, int((xmax - xmin) * pixel_density))
    im = np.linspace(ymin, ymax, int((ymax - ymin) * pixel_density))
    return re[np.newaxis, :] + im[:, np.newaxis] * 1j

def is_stable(c, num_iterations):
    z = 0
    for _ in range(num_iterations):
        z = z ** 2 + c
    return abs(z) <= 2

def get_members(c, num_iterations):
    mask = is_stable(c, num_iterations)
    return c[mask]

c = complex_matrix(-2, 0.5, -1.5, 1.5, pixel_density=REF("A10"))
plt.imshow(is_stable(c, num_iterations=20), cmap="binary")
plt.gca().set_aspect("equal")
plt.axis("off")
plt.tight_layout(pad=0)

# Save image
buf = io.BytesIO()
plt.savefig(buf)
buf.seek(0)
img = Image.open(buf)
img
  1. Create a reference to this so it’s easier to see

20

Notice that line 21 contains REF("A10").

The pixel density of the mandelbrot image is based on the value in cell A10.

Change A10 to 5, then 8, then 32, then 512 to see the impact.


21

You probably didn’t notice, but the Polars group by calculation never updated while you made those changes.

If this were Python in Excel, the Polars DataFrame and group by would have recalculated every time.

But in isolated mode, only cells with a REF() to the changed cell get run.

This is standard Excel behavior, and it is the default way which Anaconda Code runs. No need to think about row-major order or have calculations slow down when there are lots of Python cells in a spreadsheet.


User Defined Functions (UDFs)

22

User-defined functions are defined in the Imports and Definitions, and can be called directly from Excel without creating a new Python cell.

  1. Click on the Imports and Definitions tab
  2. Add this code below # Define classes and functions
@UDF(name='ORG.STRUCTURE', nested=False)
def org_structure(elist: UDF.Range, start, end, header=True):
    if header:
        elist = elist[1:]
    G = nx.from_edgelist(elist)
    return nx.shortest_path(G, start, end)
  1. Click Apply

What does this do?

It registers the function org_structure with Excel so that it can be called as =ORG.STRUCTURE().


23

Let’s add some data to work with.

Add data to cell K1 using copy, then Paste Special -> Text

={"Manager","Employee"; "Alice","Bob";"Alice","Caitlyn"; "Alice","Doreen"; "Bob","Ellis"; "Bob","Fred"; "Bob","Gregory"; "Caitlyn","Heidi"; "Doreen","Ishmael"; "Doreen","Jaoquin"; "Doreen","Katherine"; "Doreen","Lynn"; "Fred","Mary"; "Fred","Nora"; "Ishmael","Olive"; "Jaoquin","Peter"; "Jaoquin","Quinn"; "Nora","Ross"; "Nora","Stephen"; "Nora","Telly"; "Quinn","Ursula"; "Quinn","Vanessa"; "Bob","Walter"; "Walter","Xavier"; "Peter","Yane"; "Vanessa","Zoe"}

Enter in cell N2

=ORG.STRUCTURE(K1#, K2, L26)

Okay, what is happening here?

We have a manager-employee listing, which is known as an edgelist in graph theory. We can construct the graph using networkx and call the shortest_path algorithm to find the path between any two employees.

Importantly, we call it like a normal Excel function and pass data in like a normal function.

It just happens to call the function we defined in Imports and Definitions.


24

Because it is a now a normal Excel function, we can mix and match with other Excel functions.

Let’s make the result spill horizontally rather than vertically.

Modify cell N2

=TRANSPOSE(ORG.STRUCTURE(K1#, K2, L26))

Wrap Up

  • Anaconda Toolbox attempts to make Python in Excel more useful
    • Visual Chart Builder
    • AI Assistant
    • Sharing datasets and code snippets with colleagues
  • Anaconda Code reimagines how Python in Excel could work
    • Local runtime
    • Ability to install additional libraries
    • Isolated run mode
    • User-defined functions (UDFs)
    • Supports R in Excel