神刀安全网

[工具资源] Python 调用 COM 的 API 实现 Microsoft Office(PPT, Excel, Word) 办公自动化

Python调用COM的API实现Microsoft Office(PPT, Excel, Word)办公自动化

In [3]:
from IPython.core.display import Image pic = lambda p : Image(filename='./resPPT/%s' % p)

Automating Microsoft Office with Python¶

Windows applications, for many years, have provided a COM API for automation. This includes Microsoft Office as well.

pywin32 is a library that lets you do many interesting things in Windows, including access these COM APIs.

For example, to open PowerPoint and draw a circle, this is what it takes:

Python实现Microsoft Office办公自动化¶

多年以前,微软就为Windows软件出了 COM 组件,便于调用其API实现自动化,也包括Microsoft Office。

pywin32 就是这样的库,能让你在Windows上做很多有趣的事儿,包括调用COM的API。

例如,要打开PPT画个圈儿,可以这么做:

In [17]:
import win32com.client  # Open PowerPoint Application = win32com.client.Dispatch("PowerPoint.Application")  # Add a presentation Presentation = Application.Presentations.Add()  # Add a slide with a blank layout (12 stands for blank layout) Base = Presentation.Slides.Add(1, 12)  # Add an oval. Shape 9 is an oval. oval = Base.Shapes.AddShape(9, 100, 100, 100, 100)

In [4]:
pic('ppt0.png')

Out[4]:
[工具资源] Python 调用 COM 的 API 实现 Microsoft Office(PPT, Excel, Word) 办公自动化

You’ll have to try this out to see the result, but just FYI, this will open a new PowerPoint window and add a slide with a circle in it.

This opens up a lot of opportunities for slideware. Similarly, we can open an Excel application, add a circle, and change a few cells.

你看,运行之后就能看到结果:打开一个PPT,新建一页,然后在上面画一个圈儿。

这为幻灯片制作开启了新思路。当然,我们也可以打开一个Excel,在上面画一个圈儿,向单元格添加内容。

In [6]:
# Open Excel Application = win32com.client.Dispatch("Excel.Application")  # Show Excel. Unlike PPT, Word & Excel open up "hidden" Application.Visible = 1  # Add a workbook Workbook = Application.Workbooks.Add()  # Take the active sheet Base = Workbook.ActiveSheet  # Add an oval. Shape 9 is an oval. oval = Base.Shapes.AddShape(9, 100, 100, 100, 100)  # In the first row, add Values: 0.0, 0.5, 1.0 Base.Cells(1, 1).Value = 'Values' Base.Cells(1, 2).Value = 0.0 Base.Cells(1, 3).Value = 0.5 Base.Cells(1, 4).Value = 1.0

In [7]:
pic('ppt1.png')

Out[7]:
[工具资源] Python 调用 COM 的 API 实现 Microsoft Office(PPT, Excel, Word) 办公自动化

This means one can go about creating Excel models directly with Python.

看到了吧,可以直接通过python建立Excel。

Picturing the IMDb Top 250¶

Let’s begin by creating a slide that shows all of the Top 250 movies on the IMDb.

First, let’s load all the movies.

画图——IMDb Top 250¶

让我们做点儿更好玩儿的,用PPT显示IMDb的 Top 250 电影

In [18]:
from lxml.html import parse  tree = parse('http://www.imdb.com/chart/top') movies = tree.findall('.//table[@class="chart"]//td[@class="titleColumn"]//a') movies[0].text_content()

Out[18]:
'The Shawshank Redemption'

We can show these movies. But before that, we can’t remember these numbers for the shapes (like 9 is for circle). Let’s pre-define those in line with how Office uses them and import them.

lxml 可以轻松获取数据( BeautifulSoup 也行)。但是,我们记不住形状对应数字(如9表示椭圆oval,5表示矩形rectangle),让我们先看看这些数字Office是如何用的。

In [79]:
# 使用win32com里面的makepy -d生成MSO, MSPPT import MSO, MSPPT g = globals() for c in dir(MSO.constants):    g[c] = getattr(MSO.constants, c) for c in dir(MSPPT.constants):  g[c] = getattr(MSPPT.constants, c)

Let’s draw each movie poster as a little box on a 25×10 grid. We don’t have the images yet, but first, let’s just draw the rectangles.

我们把每部电影的海报放到25列10行的栅格里,每个格子宽28高41。抓图片之前,我们先把格子都画好。

In [40]:
Base = Presentation.Slides.Add(1, ppLayoutBlank)  width, height = 28, 41 for i, movie in enumerate(movies):     # 25 columns one row, column step is 28     x = 10 + width * (i % 25)     # 10 rows one column, row step is 41     y = 100 + height * (i // 25)     r = Base.Shapes.AddShape(             5, # Shape Rectangle             x, y,             width, height)

In [8]:
pic('ppt2.png')

Out[8]:
[工具资源] Python 调用 COM 的 API 实现 Microsoft Office(PPT, Excel, Word) 办公自动化

It would be nice to get posters into those, so let’s scrape the posters.

海报放框里应该会好看点,下面我们抓海报。

In [16]:
import os from urlparse import urljoin from urllib import urlretrieve from hashlib import md5  # We'll keep the files under an img/ folder if not os.path.exists('img'):     os.makedirs('img')      def filename(movie):     '''Filename = MD5 hash of its title in UTF8'''     name = md5(movie.text_content().encode('utf8')).hexdigest()     return os.path.join('img', name + '.jpg')      for movie in movies:     if os.path.exists(filename(movie)):         continue              url = urljoin('http://www.imdb.com/', movie.get('href'))     tree = parse(url)     img = tree.find('.//td[@id="img_primary"]//img')     urlretrieve(img.get('src'), filename(movie))

Now, instead of just rectangles, we’ll use the posters.

现在,我们用这些海报来填充。

In [22]:
Base = Presentation.Slides.Add(1, 12)  width, height = 28, 41 for i, movie in enumerate(movies):     x = 10 + width * (i % 25)     y = 100 + height * (i // 25)     image = Base.Shapes.AddPicture(         os.path.abspath(filename(movie)),         LinkToFile=True,         SaveWithDocument=False,         Left=x, Top=y,         Width=width, Height=height)

In [9]:
pic('ppt3.png')

Out[9]:
[工具资源] Python 调用 COM 的 API 实现 Microsoft Office(PPT, Excel, Word) 办公自动化

Wouldn’t it be nice to have these hyperlinked to the movies?

要是海报能链接到每部电影就更帅了。

In [36]:
Base = Presentation.Slides.Add(1, 12)  width, height = 28, 41 for i, movie in enumerate(movies):     x = 10 + width * (i % 25)     y = 100 + height * (i // 25)     image = Base.Shapes.AddPicture(         os.path.abspath(filename(movie)),         LinkToFile=True,         SaveWithDocument=False,         Left=x, Top=y,         Width=width, Height=height)     url = urljoin('http://www.imdb.com/', movie.get('href'))     link = image.ActionSettings(ppMouseClick).Hyperlink     link.Address = url     link.ScreenTip = movie.text_content().encode('cp1252')

This is ordered by rank, which is useful, but this makes it hard to locate a specific movie. What if we could sort this alphabetically?

But then, we don’t want to lose the ordering by rank either. Could we, perhaps, get these movies to move on the click of a button to alphabetical or rank order?

Let’s start by adding two buttons — one to sort alphabetically and the ohter to sort by rank.

这是按照评分排序的,通常如此,但是很难定位到某一个电影。要是按字母排序呢?

但是,我们也不想丢失评分排序。那么,我们能不能加个按钮,让这些电影分别按字母/评分排序?

让我们加两个按钮吧——分别按字母/评分排序。

In [47]:
Base = Presentation.Slides.Add(1, 12)  # Add two buttons: alphabetical and by rating button_alpha = Base.Shapes.AddShape(5, 400, 10, 150, 40) button_alpha.TextFrame.TextRange.Text = 'Alphabetical'  button_rating = Base.Shapes.AddShape(5, 560, 10, 150, 40) button_rating.TextFrame.TextRange.Text = 'By rating'  # Get the index position when sorted alphabetically movies_alpha = sorted(movies, key=lambda v: v.text_content()) index_alpha = dict((movie.text_content(), i) for i, movie in enumerate(movies_alpha))

We’ll create a function that moves an image along a path when a trigger is clicked. This will be applied to each of the images.

我们建一个函数实现让按钮按下的时候,电影图片会沿着一个路径移动,然后将函数应用到每张图片。

In [54]:
def animate(seq, image, trigger, path, duration=1.5):     '''Move image along path when trigger is clicked'''     effect = seq.AddEffect(        Shape=image,        effectId=msoAnimEffectPathDown,        trigger=msoAnimTriggerOnShapeClick,     )     ani = effect.Behaviors.Add(msoAnimTypeMotion)     ani.MotionEffect.Path = path     effect.Timing.TriggerType = msoAnimTriggerWithPrevious     effect.Timing.TriggerShape = trigger     effect.Timing.Duration = duration

Finally, we draw all the images. After drawing them, we specify one animation for alphabetical ordering, and another for ordering by rating.

最后,我们将函数应用到所有图片上。之后,我们指定一个动画为按字母排序,另一个动画为按评分排序。

In [55]:
seq_alpha = Base.TimeLine.InteractiveSequences.Add() seq_rating = Base.TimeLine.InteractiveSequences.Add() width, height = 28, 41 for i, movie in enumerate(movies):     x = 10 + width * (i % 25)     y = 100 + height * (i // 25)     image = Base.Shapes.AddPicture(         os.path.abspath(filename(movie)),         LinkToFile=True,         SaveWithDocument=False,         Left=x, Top=y,         Width=width, Height=height)     url = urljoin('http://www.imdb.com/', movie.get('href'))     link = image.ActionSettings(ppMouseClick).Hyperlink     link.Address = url     link.ScreenTip = movie.text_content().encode('cp1252')          # Alphabetical      index = index_alpha[movie.text_content()]     animate(seq_alpha, image, trigger=button_alpha, path='M0,0 L{:.3f},{:.3f}'.format(         (10 + width * (index % 25) - x) / 720.,         (100 + height * (index // 25) - y) / 540.,     ))          # By rating     animate(seq_rating, image, trigger=button_rating, path='M{:.3f},{:.3f} L0,0'.format(         (10 + width * (index % 25) - x) / 720.,         (100 + height * (index // 25) - y) / 540.,     ))

In [10]:
pic('ppt4.png')

Out[10]:
[工具资源] Python 调用 COM 的 API 实现 Microsoft Office(PPT, Excel, Word) 办公自动化

Simple Treemap

Let’s do something that you’d have trouble doing manually in PowerPoint: a Treemap . The Guardian’s data store kindly makes available the top 50 banks by assets that we’ll use for this example. Our target output is a simple Treemap visualisation.

简单的矩形树状结构图¶

我们再用PPT做一个费事的 Treemap 。我们将使用 Guardian’s data store 里面的 top 50 banks by assets 数据来做一个简单的 矩形树状结构图

In [85]:
Application = win32com.client.Dispatch("PowerPoint.Application") Application.Visible = True Presentation = Application.Presentations.Add() Slide = Presentation.Slides.Add(1, ppLayoutBlank)

Now let’s import data from The Guardian. The spreadsheet is Google Spreadsheets and we can get just the banks and assets as a CSV file by adding "&output=csv&range=B2:C51" (via OUseful.Info ).

现在我们导入数据Guardian. 数据在 Google Spreadsheets ,我们可以通过在链接后面增加"&output=csv&range=B2:C51"获取数据 (参考 OUseful.Info ).

In [86]:
import urllib2, csv url = 'http://spreadsheets.google.com/pub?key=phNtm3LmDZEOoyu8eDzdSXw&output=csv&range=B2:C51' reader = csv.reader(urllib2.urlopen(url)) data = list((int(size.replace(',','')), bank.decode('utf8')) for bank, size in reader)

I created a simple Treemap class based on the squarified algorithm — you can play with the source code . This Treemap class can be fed the the data in the format we have, and a draw function. The draw function takes (x, y, width, height, data_item) as parameters, where data_item is a row in the data list that we pass to it.

我用 squarified algorithm 建了一个simple Treemap类,你可以在 源代码 看到。这个Treemap类可以把数据转换成指定的格式。

In [87]:
class Treemap:     def __init__(self, width, height, data, draw):         '''Treemap(width, height, data, fn) '''         self.x, self.y = 0.0, 0.0         self.scale  = (float(width * height) / self.get(data, sum)) ** 0.5         self.width  = float(width)  / self.scale         self.height = float(height) / self.scale         self.draw   = draw         self.squarify(data, [], min(self.width, self.height))      def get(self, data, fn): return fn((x[0] for x in data))      def layoutrow(self, row):         if self.width >= self.height:             dx = self.get(row, sum) / self.height             step = self.height / len(row)             for i,v in enumerate(row): self.draw(self.scale * self.x, self.scale * (self.y + i * step), self.scale * dx, self.scale * step, v)             self.x += dx             self.width -= dx         else:             dy = self.get(row, sum) / self.width             step = self.width / len(row)             for i,v in enumerate(row): self.draw(self.scale * (self.x + i * step), self.scale * self.y, self.scale * step, self.scale * dy, v)             self.y += dy             self.height -= dy      def aspect(self, row, w):         s = self.get(row, sum)         return max(w*w*self.get(row, max)/s/s, s*s/w/w/self.get(row, max))      def squarify(self, children, row, w):         if not children:             if row: self.layoutrow(row)             return         c = children[0]         if not row or self.aspect(row, w) > self.aspect(row + [c], w):             self.squarify(children[1:], row + [c], w)         else:             self.layoutrow(row)             self.squarify(children, [], min(self.height, self.width))

还包括一个画图函数。画图函数参数为(x, y, width, height, data_item),其中data_item是输入数据。

In [88]:
def draw(x, y, w, h, n):     shape = Slide.Shapes.AddShape(msoShapeRectangle, x, y, w, h)     shape.TextFrame.TextRange.Text = n[1] + ' (' + str(int(n[0]/1000 + 500)) + 'M)'     shape.TextFrame.MarginLeft = shape.TextFrame.MarginRight = 0     shape.TextFrame.TextRange.Font.Size = 12 Treemap(720, 540, data, draw)

Out[88]:
<__main__.Treemap instance at 0x08BFF148>

In [11]:
pic('ppt5.png')

Out[11]:
[工具资源] Python 调用 COM 的 API 实现 Microsoft Office(PPT, Excel, Word) 办公自动化

Real-life examples¶

You can see how these were put to use at Gramener :

企业案例¶

你可以在 Gramener 看到很多这样的东东:

Live tweeting¶

Just for kicks, let’s use PowerPoint as a dashboard to show live tweets.

I picked TwitterAPI to get streaming results, but twython and Python Twitter Tools look fine too.

Live tweeting¶

让我们用PPT做一个dashboard来显示tweets信息。良好的网络环境是前提。

我用 TwitterAPI 来获取信息, twythonPython Twitter Tools 当然也行。

In [60]:
from TwitterAPI import TwitterAPI  # I'm keeping my keys and secrets in a secret file. from secret_twitter import consumer_key, consumer_secret, access_token_key, access_token_secret api = TwitterAPI(consumer_key, consumer_secret, access_token_key, access_token_secret)

This function will draw a tweet in a reasonably nice way on a slide. There’s a block each for the profile picture, the text of the tweet, and the name of the user.

这个函数会让tweet在幻灯片中漂亮地呈现。包括配置图片,tweet内容,用户名。

In [64]:
def draw_tweet(Base, item, pos):     y = 40 + (pos % 4) * 120          image = Base.Shapes.AddPicture(         # To get the larger resolution image, just remove _normal from the URL         item['user']['profile_image_url'].replace('_normal', ''),         LinkToFile=True,         SaveWithDocument=False,         Left=20, Top=y,         Width=100, Height=100)          try:         status = item['text'].encode('cp1252')     except UnicodeEncodeError:         status = item['text']     text = Base.Shapes.AddShape(5, 130, y, 460, 100)     text.Fill.ForeColor.ObjectThemeColor = msoThemeColorText1     text.Fill.ForeColor.Brightness = +0.95     text.Line.Visible = msoFalse     text.TextFrame.TextRange.Text = status     text.TextFrame.TextRange.Font.Color.ObjectThemeColor = msoThemeColorText1     text.TextFrame.TextRange.ParagraphFormat.Alignment = ppAlignLeft          user = Base.Shapes.AddShape(msoShapeRectangle, 600, y, 100, 100)     user.Fill.ForeColor.ObjectThemeColor = msoThemeColorAccent6     user.Line.Visible = False     user.TextFrame.TextRange.Text = '@' + item['user']['screen_name']

Let’s track requests for specific words, and see what we get.

让我们来看看跟踪的结果。

In [ ]:
Base = Presentation.Slides.Add(1, ppLayoutBlank) api.request('statuses/filter', {'track': '1'})  for pos, item in enumerate(api.get_iterator()):     draw_tweet(Base, item, pos)     if pos > 10:         break

GAME OVER

转载本站任何文章请注明:转载至神刀安全网,谢谢神刀安全网 » [工具资源] Python 调用 COM 的 API 实现 Microsoft Office(PPT, Excel, Word) 办公自动化

分享到:更多 ()

评论 抢沙发

  • 昵称 (必填)
  • 邮箱 (必填)
  • 网址