Planet Python - http://planet.python.org/

Last Updated: Nov 20 2008, 07:15:58

Mike C. Fletcher: No thrill on Ruby yet

20 Nov 2008 04:20:35
Just wrote my first Ruby script (I've only read about it before this).  I'm not falling in love yet... can't say I'm really even finding it beautiful.  All those "ends" scattered around, scoping with line-noise.  I see the niceness of the block abstraction, but I haven't yet come to need them.  Ah well, suppose I should plough through a lot more before I start really evaluating it.

James Tauber: Post Length By Month

20 Nov 2008 01:33:56

I made the comment in my Half Time Report that I think I've managed to stick to "posts no shorter than normal and no longer".

Alas the data don't match that.

Here's a graph of average post length (just len(page.content.split())) by month:

Clearly the posts this month are longer on average. And because I'm writing more of them, the last point of this next graph is no surprise.

This is a graph with post lengths totalled for each month:

And obviously this month has a fair bit to go. What's more interesting though is that my first year of blogging saw total words posted per month grow to a peak of around 7500 in December 2004, followed by a clear downward trend to the trough in August 2006, when I posted just 3 short posts the whole month.

These graphs were done with IBM's Many Eyes simply because Apple Numbers fails miserably at charting anything but the smallest data sets.
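
The two series plotted above (average and total post length per month) could be computed along these lines. This is only a sketch: the (month, content) pairs and the function name length_by_month are assumptions for illustration, and the only part taken from the post is the len(content.split()) word count.

```python
from collections import defaultdict


def length_by_month(posts):
    """Return {month: (total_words, average_words)} for (month, content) pairs."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for month, content in posts:
        # Same measure as in the post: whitespace-separated word count.
        totals[month] += len(content.split())
        counts[month] += 1
    return {month: (totals[month], totals[month] / counts[month]) for month in totals}


# Hypothetical sample data, only to show the call shape.
sample = [
    ("2008-10", "a b c"),
    ("2008-11", "one two"),
    ("2008-11", "three four five six"),
]
stats = length_by_month(sample)
```

Keeping totals and counts separate makes it easy to feed either series to whatever charting tool is at hand.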


IronPython-URLs: Silverlight 2 and Dynamic Languages

20 Nov 2008 00:03:37
Well my backlog is logged so far back that I haven't even blogged about the final release of Silverlight 2.

In case you've been living under a rock for the last year, Silverlight 2 is a browser plugin from Microsoft. It is similar to Flash (aimed at games, media streaming and rich internet applications) and is cross-platform (Mac OS X and Windows - the officially blessed Linux port Moonlight by the Mono guys is making good progress though) and cross browser (IE 7+, Safari & Firefox 2+). Unlike Flash it can be programmed with a choice of languages, and through the Dynamic Language Runtime it can be programmed in Python, Ruby and Javascript.

Silverlight 2 final is now out, and according to Scott Guthrie has now been installed on over 100 million consumer computers.

Naturally Jimmy Schementi, who maintains the Dynamic Language Support for Silverlight, released an updated version of the Silverlight Dynamic Languages SDK (sucky name - more on this in a bit):
In preparation for my talk at the PyWorks conference I updated my IronPython Web IDE (tool for experimenting with the Silverlight APIs from Python in the browser) and Try Python (interactive Python interpreter in the browser) for Silverlight 2:
Of course if you're interested in building internet applications with Silverlight then you will want the extended controls that come as part of the Visual Studio Tools for Silverlight 2 (this works with Visual Studio Professional or the free Visual Web Developer Express). The assemblies that come with the tools include the data grid, extra controls like the date picker and various other useful APIs.

Even better is the Silverlight Toolkit. This is by Microsoft, but Open Source (living on Codeplex and with Unit Tests). As it is a separate project it can have a separate release cycle, including experimental components and being updated more frequently than Silverlight itself.

The toolkit includes charting components plus new controls covering styling, layout, and user input.

Since these releases Jimmy Schementi has been far from idle. His latest blog entries track what he has been up to:

Though, for certain scenarios, running scripts in a VB/C# application would be useful. For example, a shopping application that has a bunch of business rules, like 'when someone has three items in their cart that all have to do with cooking, give them 10% off.' These type of rules can change all the time, and traditionally you'd either store the rules in a database and implement a engine to understand the rules, or hand-code them yourself and have to redeploy the system every time you want to change them. Or, you could save yourself the hassle and store the rules as Python or Ruby code, and then host the DLR in your application to run the code. Want to update the rules? Just update the code, nothing more.
Embedding an IronRuby REPL (interactive interpreter) in a Silverlight application.
This is the good one! Jimmy posted this email to the IronRuby and IronPython mailing lists:

First and foremost, I want to thank anyone who has used the bits on http://codeplex.com/sdlsdk, and accepting my bullshit version of open-source. While getting monthly binaries/sources is nice, it should be about working on the project together ... not just me throwing stuff over the wall to you. That's changing, now ...

Oh, and remember me complaining about the crappy 'sdlsdk' name ... well, I'm trying to get rid of that acronym ...

http://github.com/jschementi/agdlr

Above the public repository for the DLR integration in Silverlight. The following post explains what's in there, what's not, what's git, and how to contribute: http://blog.jimmy.schementi.com/2008/11/agdlr-silverlight-dlr-open-source.html

My first order of collaboration is this simple new feature, 'console=true': http://blog.jimmy.schementi.com/2008/11/repls-in-silverlight.html. If you like this, please feel free to look at what's been done, and if you want to fix something that doesn't yet work correctly, I won't stop you.

Also, as I mentioned in a previous mail, I want to make the filesystem->XAP/isolatedstorage metaphor stronger, so feel free to experiment with that as well. Over the next week I'll get some website-presence/wiki/etc, and we can run this project up and running. There are still some hurdles I need to clear with getting contributed code back into our internal codebase, and shipping on Codeplex, but there are no problems with keeping things on GitHub for now.

Let me know if there are any question. I know I've been a bit silent on the Silverlight front, but take this as me making it up to you.

The resulting discussion also revealed where the Silverlight development tool Chiron got its name from:

Yep, Ag is Silver … made pretty obvious by my little logo for it.

As far as Chiron. It's a planetoid between Saturn and Uranus. The port that Chiron.exe runs on by default, 2060, is Chiron's 'object' number. It was derived from the Cassini ASP.NET Web server that Dmitry Robsman wrote. Cassini was a probe mission to explore the moons of Saturn, and Chiron was initially thought to be a moon of Saturn.

Plus this from Michael Letterle:

More importantly, it's also the name of one of Jonathan Coulton's songs: 'Chiron Beta Prime'.

Because of this I knew how to pronounce the name :)


James Tauber: The Long and Short of Mathematics

19 Nov 2008 23:01:33

I've previously talked about Oxford's Very Short Introduction series. My first introduction to it (via a recommendation from Greg Mankiw) was Timothy Gowers' Mathematics: A Very Short Introduction which is the best little (160 page) book I've ever read on what mathematics is really about.

A few weeks ago, I bought The Princeton Companion to Mathematics which weighs in at 1008 pages. It's a sweeping vista of pure mathematics, and probably the best big book I've ever read on pure mathematics in general. It provides survey articles on many different areas within pure mathematics from both a conceptual and historical viewpoint. I would say most of the book requires some college-level background in mathematics and some sections would best suit graduate students (although to give them breadth rather than depth) but it's the kind of book that you can dive in to at any point and learn something.

So it's interesting that the editor of the PCM is the same Timothy Gowers that wrote the Oxford Very Short Introduction.

Well done, Professor Gowers. You have succeeded in producing what I think are the best small and large single volume books on mathematics.

Just like in Greek Lexicography we have the "Little Liddell", "Middle Liddell" and "Big Liddell" (referring to the abridged, intermediate and full versions of Liddell and Scott's A Greek-English Lexicon) I think these books should be known as "Little Gowers" and "Big Gowers" :-)


Greg Wilson: If a Computer Has Touched You Inappropriately…

19 Nov 2008 22:21:28

The Computer Science department at the University of Toronto is featured in a recent episode of “Byte Club”.  I can hear the hippos whispering…


Peter Bengtsson: domstripper - A lxml.html test project

19 Nov 2008 22:00:00

I'm just playing with the impressive lxml.html package. It makes it possible to easily work with HTML trees and manipulate them.

I had this crazy idea of a 'DOM stripper' that removes all but specified elements from an HTML file. For example you want to keep the contents of the <head> tag intact but you just want to keep the <div id='content'>...</div> tag thus omitting <div id='banner'>...</div> and <div id='nav'>...</div>. domstripper now does that. This can be used for example as a naive proxy that transforms a bloated HTML page into a more stripped down smaller version suitable for say mobile web browsers. It's more a proof of concept than anything else.

To test you just need a virtual python environment and the right system libs needed to install lxml. This worked for me:

 $ sudo apt-get install cython libxslt1-dev
 $ cd /tmp
 $ virtualenv --no-site-packages testenv
 $ cd testenv
 $ source bin/activate
 $ easy_install domstripper

Now you can use it like this:

 >>> from domstripper import domstripper
 >>> help(domstripper)
 ...
 >>> domstripper('bloat.html', ['#content', 'h1.header'])
 <!DOCTYPE...
 ...

Best to just play with it and see if it makes sense. I'm not saying this is an amazing package but it goes to show what can be done with lxml.html and the extremely user friendly CSS selectors.
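
To make the idea concrete without needing lxml installed, here is a rough standard-library sketch of the same kind of stripping. It is not domstripper's implementation: it uses xml.etree.ElementTree instead of lxml.html, it only works on well-formed markup, it matches on id attributes rather than CSS selectors, and the function name strip_body is made up for illustration.

```python
import xml.etree.ElementTree as ET


def strip_body(xhtml, keep_ids):
    """Leave <head> alone; keep only <body> descendants whose id is in keep_ids."""
    root = ET.fromstring(xhtml)
    body = root.find("body")
    # Collect the elements to keep before modifying the tree.
    # Limitation: a kept element nested inside another kept element
    # would appear twice in the output.
    kept = [el for el in body.iter() if el is not body and el.get("id") in keep_ids]
    for child in list(body):
        body.remove(child)
    body.text = None
    for el in kept:
        el.tail = None  # drop stray text that followed the element
        body.append(el)
    return ET.tostring(root, encoding="unicode")


page = (
    "<html><head><title>t</title></head><body>"
    "<div id='banner'>ad</div>"
    "<div id='content'><p>hi</p></div>"
    "<div id='nav'>n</div>"
    "</body></html>"
)
stripped = strip_body(page, {"content"})
```

Real-world HTML is rarely well-formed XML, which is exactly why a forgiving parser like lxml.html is the better tool for the actual package.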


Beginning Python for Bioinformatics: Creating an interface for the motif finding script, final

19 Nov 2008 21:57:24

We can say that this would be our final version of the script. There are many nice wxPython programming resources, and one is a very good book called wxPython in Action, which is co-written by Robin Dunn, the wxPython maintainer. Go check it out.

So for the last entry in this series, we just need to add a couple of changes to our interface and motif finding scripts. Basically on the interface script we need to add a line that gets the value entered (or the default one, if not changed) in the motif width input box. And we can do that by including the line below in the run_finder function.

 width = self.motif_width.GetValue() 

This line tells the script to get the value of the box and assign it to the variable width. The method returns whatever is inside the input box as a string. Now we need to create the structure to actually send this value to the motif finder functions. The last version of our function calculate_motifs received two parameters; we need to add an extra one, and also change the lines that call the function that gets the quorums. Basically the first lines of the function will be

 def calculate_motifs(input_seqs, input_seqs2, width):

     print input_seqs, input_seqs2
     input_seqs = fasta.read_seqs(open(input_seqs).readlines())
     input_seqs2 = fasta.read_seqs(open(input_seqs2).readlines())

     # GetValue() hands us a string, but get_quorums does arithmetic with it
     width = int(width)

     foreground = get_quorums(input_seqs, width)
     background = get_quorums(input_seqs2, width)

And that’s it. Our simple interface is ready for primetime. OK, not prime primetime: we didn’t add the series of features that would make it useful for everyone. For instance, there is no error control, so someone could enter ‘ABC’ in the width input box and that value would be sent and an error would occur. Also you can click the run button without any file selected. And we could go on and on. But this is just a primer, and we can build from it.
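
As a hint of what the missing error control for the width box might look like, here is a small sketch. The helper name parse_width and the default of 10 (the initial value of the input box) are assumptions for illustration; nothing like this exists in the posted code.

```python
def parse_width(text, default=10):
    """Turn the text-box contents into a positive int, or raise ValueError."""
    text = text.strip()
    if not text:
        # Empty box: fall back to the default width.
        return default
    width = int(text)  # raises ValueError for input such as 'ABC'
    if width <= 0:
        raise ValueError("motif width must be positive, got %d" % width)
    return width
```

In run_finder the call could be wrapped in a try/except ValueError block that shows a message box instead of letting the error reach the motif finder.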

The code is on Github, so get it there and have fun. Next time we will see … no plans yet. We’ll see …



Greg Wilson: Watch This Space

19 Nov 2008 17:38:34

Basie Logo


Ted Leung: Python in NetBeans

19 Nov 2008 16:56:00

Along with today’s launch of NetBeans 6.5, Sun, in cooperation with the NBPython community, is releasing an early access version of Python support for NetBeans. This is a result of the collaboration between Sun people and the NBPython project that I wrote about back in July. This release has been tested by folks in the NetBeans community and some folks from Sun’s NetBeans QA team, and it’s in pretty good shape for an early access release. We’re interested in getting people’s feedback. We would also love to see more people get involved with NBPython.

How to get it?

You can get NetBeans Python from the NetBeans download page.

What’s in it?

The basic feature set for the early access release consists of an editor for Python, the ability to execute Python programs (using CPython or Jython), and a debugger.

There’s a tutorial up on the NetBeans wiki.

Tor Norbye, who did most of the work on the editor, has written a series of blog posts detailing various features of the Python editor.

Who did it

Allan Davis - project and platform management, interactive console.

Jean-Yves Mengant - Jean-Yves is the author of the jpydbg debugger, which he’s merged into NBPython.

Amit Saha - documentation and help sets - Amit works for Sun, but he’s doing Python on his own time.

Tor Norbye (Sun) - editing.

Tomas Zezula (Sun) - project and platform management.

Ted Leung (me) (Sun) - various behind the scenes stuff.

Frank Wierzbicki (Sun) - NBPython is using Jython’s parser and Frank worked with Tor to add support for positions and better error reporting.

Peter Lam (Sun) - Sun QA

Tony Beckham (Sun) - Sun QA

The NetBeans CAT community as well as those folks who drove by and reported bugs.

How to get involved

NBPython has become a full fledged NetBeans project, so the main project page is now on NetBeans.org, as are the issue tracker and mailing lists:

nbpython-dev@netbeans.org
nbpython-issues@netbeans.org
nbpython-commits@netbeans.org
nbpython@netbeans.org


Mike C. Fletcher: Now we need 3 million dollars...

19 Nov 2008 16:14:08
I chose 1 million dollars as a small number that should be possible to raise if we have a good enough proposal.  The top three voted projects focus on, one way or another, performance and applicability.  The winning project essentially was to improve Python's concurrency story markedly.  Basically spend the money on hiring people to really pound on multi-processing and maybe even fine-grained locking (concurrent threading) in CPython to produce far better coarse-grained concurrency support.  We'd have to see who would be interested in funding this type of research... we saw concurrency as a problem with the current CPython implementation (though there was the hint that it's a problem of perception, rather than necessarily a practical problem).

The pan-mobile SDK project was more of an immediate-term goal, but it requires significant compiler technology and a lot of hard work in order to produce a commercial-quality resulting platform/SDK.  There was some question as to whether $1 million would be enough to produce the requisite quality, but as it's a commercial project, the $1 million could be seed money and you might be able to draw handset manufacturers in to further funding.  The project seems very doable, it's just a question of money, time, vision and effort.

The idea of fine-grained concurrency (e.g. spreading for loops across processors automatically if you can prove there's no dependencies in the loop code) was seen more as a PyPy project, as was the addition of explicit asynchronous operation syntax. We spent a while discussing what would be needed for each project, and the idea that PyPy would make the mobile-targeting SDK easier, and that concurrency/performance issues would be best worked out in PyPy came up a number of times. Effectively, while we had proposals that said "Python should be", we didn't get a lot of support for the idea of using CPython as the platform for figuring out how to do those things beyond minor changes to the current approaches.

By our rough estimates, PyPy development was 10 person-years from being a robust platform for widespread real-world development. That's cutting it close for a mere 1 million dollars, but it might be feasible if we partnered with other people who would have a stake in the result.  Performance improvements, concurrency primitives and back-ends targeting mobile-type platforms were all identified as areas that would make sense to focus energy.  Realistically, though, the core PyPy tool chain probably needs to be the first focus and then we can look at the extras.


Malthe Borch: Speedup

19 Nov 2008 14:03:00

You can now have your cake and eat it, too.

Plone benchmark comparing Chameleon to ZPT

Disclaimer: Don't try this at home! Or actually, do try it, but only on Plone trunk. Simply pull in the five.pt egg and load its configuration. It's a whole-sale drop-in replacement of Zope Page Templates.

If you're interested in sponsoring this on-going effort, there's an excellent sponsorship opportunity for the upcoming performance-sprint in Bristol. Please contact me by e-mail at mborch@gmail.com.

Ned Batchelder's blog: Pathological backtracking

19 Nov 2008 13:14:18

At work we've been using the well-regarded feedparser module to parse RSS feeds, and it works great for the most part, but we'd occasionally get a stuck server process. The CPU would spike to 100%, and wouldn't make any progress.

We discovered a particular feed would cause a particular regular expression in the code to spin endlessly. The regex was intended to determine if a style attribute is valid CSS:

if not re.match('^(\s*[-\w]+\s*:\s*[^:;]*(;|$))*$', style):
    return ''

Breaking this out into verbose regex syntax shows how it matches valid CSS:

'''(?x)             # use verbose regex syntax
    ^(
                    # A single CSS clause is:
    \s*             #   leading whitespace
    [-\w]+          #   a dash-word, the property name
    \s*:\s*         #   space, colon, space
    [^:;]*          #   anything but :;, the value
    (;|$)           #   ends with a semi or the end of the string
    
    )*              # Valid CSS is any number of clauses
    $
'''

And here's the snippet discovered in the feed that spun us hard (with whitespace added for readability):

<var style='COLOR: #fffafe; coming: ; basket: ; philologist: ; gradually: ;
encyclic: ; whitechapel: ; left: ; albino: ; lamelliform: ; foment: ;
adjuvant: ; Room:  ; Milk:  ; buynow: ; wheelwork: ; unseal: ; reasons: ;
socalled: ; dazed: ; Brain:  ; Kaleidoscope:  ; hardheaded: ; asthenic: ;
preferred: ;  Barbecue:  ; Comet:  ; Nail:  ; lubberly: ; School:  ;
Mist:  ; undercurrent: ; intwine: ; isotonic: ; Chief:  ; miscellaneous: ;
Book:  ; Shoes:  ; Chocolates:  ; deuced: ; you: ; Man:  ; federalize: ;
Rainbow:  ; Satellite:  ; Printer:  ; amicus: ; tautophony: ; taking: ;
regrater: ; waggon: ; prescient: ; God:  ; prosing: ; Bank:  ; hariolation: ;
patriarchs: ; Pyramid:  ; Data Base:  ; PaintBrush:  ; ingenu: ; Rope:  ;
parenchyma: ; price: ; Alphabet:  ; Circle:  ; seeks: ; frankhearted: ;
vituperate: ; dysmeromorph: ; Shop:  ; firm: ;  imperforation: ; lane: ;
Gemstone:  ; slatternly: ; Fire:  ; impudence: ; Carrot:  ; Fan:  ;
inoccupation: ; uncover: ; Liquid:  ; drawee: ; Pocket:  ;barbacan: ;
fornicatress: ; chimes: ; Crystal:  ;innovation: ; years: ; untiring: ;
Freeway:  ;desertful: ; unreined: ; Compass:  ; Hose:  ;prelusive: ;
impenetrability: ; Fruit:  ; direct: ; '></var>

(yes, it's garbage, and yes, spam sucks.)

It's hard to see the problem here, but this is not valid CSS because they used 'Data Base' as a property name about half-way through and spaces aren't allowed in property names.

The CPU spins because when the regex encounters the failure to match 'Data Base', it backtracks to reconsider previous matches in the hopes that it can still make the regex work. In fact, it isn't in an infinite loop, just a very very very long one. Eventually this regex will finish and decide that the string doesn't match.

But we don't need it to backtrack: going back to re-match previous CSS clauses isn't going to help.

Some regex libraries offer solutions to this problem. Possessive quantifiers let you use *+ to mean, match as many as possible, and once matched, don't try matching fewer during backtracking. They're called possessive because once the operator claims part of the string, it won't give it back for other operators to match later.

But Python doesn't offer possessive quantifiers (yet). So we have to choose a different technique than trying to match the whole string in one large regex. In this case we don't need the match data; we're just checking that the whole string matches, so we can use re.sub to remove matching clauses and then check that there's nothing left over:

if re.sub('\s*[-\w]+\s*:\s*[^:;]*;?\s*', '', style):
    return ''

Because re.sub grabs matches, performs the replacement, and moves on, there's no needless backtracking to throw a wrench in the works. Now our crazy CSS spam is speedily dispatched as invalid.

As an interesting side effect, if the string is not empty, what remains is the invalid part of the string.
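
Here is a small self-contained demonstration of that re.sub check, using the same pattern as above. The wrapper name invalid_remainder and the two sample style strings are made up for illustration.

```python
import re

# One CSS clause: property name, colon, value, optional semicolon.
CLAUSE = re.compile(r'\s*[-\w]+\s*:\s*[^:;]*;?\s*')


def invalid_remainder(style):
    """Strip every well-formed clause; whatever is left did not parse."""
    return CLAUSE.sub('', style)


good = invalid_remainder('color: red; margin: 0')
bad = invalid_remainder('color: red; Data Base: ; left: ;')
```

With these inputs, good comes back as the empty string, while bad is 'Data': the scan skips over the part of the bogus property name it cannot match and then consumes ' Base: ; ' as an ordinary clause.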


Christopher Lenz: The Truth About Unicode In Python

19 Nov 2008 09:05:44

The unicode support in Python is generally considered to be pretty good. And in comparison to many other languages, it's good indeed.

But compared to what is provided by the International Components for Unicode (ICU) project, there's also a lot missing, including collation, special case conversions, regular expressions, text segmentation, and bidirectional text handling. Not to mention extensive support for locale-specific formatting of dates and numbers and time calculations with different calendars.

Basically what Python does provide out of the box is “only” encoding/decoding, normalization, and some other bits such as simple case conversion and splitting on whitespace. It's the absolute minimum you need to do anything useful with unicode, but often not enough to build truly internationalized applications. (Fortunately, most applications get away without true internationalization.)

In this post, I'm going to talk about a couple of the problems with unicode in Python. Please note that this is not intended as a criticism of Python's unicode support or the people who designed and implemented it. Most of those people probably know a whole lot more about unicode than I do, and the limitations discussed here are the result of a pragmatic approach to implementing unicode support, rather than due to a lack of knowledge.
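
A quick illustration of the gap described above, written as a sketch for current Python 3 (where str is already unicode) rather than the Python 2 of the time: normalization is available in the standard unicodedata module, but sorting falls back to plain code point order because there is no built-in collation. The sample words are arbitrary.

```python
import unicodedata

# Normalization is provided: 'e' plus COMBINING ACUTE ACCENT composes to one character.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)

# Collation is not: sorted() compares code points, so a word starting
# with U+00C4 (A with diaeresis) lands after 'zebra'.
words = ["zebra", "\u00c4pfel", "apple"]
by_code_point = sorted(words)
```

A German reader would expect the word beginning with the umlaut to sort next to 'apple'; getting that order needs locale-aware collation of the kind ICU provides.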

read on …


Mike C. Fletcher: Million Dollar Ideas Wrap-up from PyGTA

19 Nov 2008 05:44:02

Great fun at PyGTA this evening.  We had an overly complex voting
scheme, basically you could vote from 0 to 5 (5 being the highest) for
each project, total count wins.  We had quite a few proposals, and
laughed rather a lot during the process.

My take-aways:

  • Concurrency matters in multi-core systems, Python needs work here
  • Performance/scaling matters, we want to take Python to smaller hardware and mobile devices particularly
  • PyPy is seen as the vehicle for radical improvements to the language, not CPython
  • We want to make public services better
  • We want to help educate people
  • We'd like to stop spam
  • Disinterest in Python 3000 is strong, but not universal, and seems to be weakening

We also discussed the sorry state of the PyGTA web-site.  We really need to get it updated.  I'm thinking a simple CMS-type site, with:

  • a calendar
  • a trace of the announcements (a blog...)
  • the ability for a few people to post notices/topics (i.e. I shouldn't be the single point of failure)
  • maybe an RSS feed
  • pointers to the mailing-lists

The Wiki features just don't get used enough to bother with them it seems.  I've got the pygta.org domain registered with Vex, so we can redirect it wherever we want once we know where we want to direct it.

Greg Wilson: Need Some Help?

19 Nov 2008 02:37:48

Next term, I’m teaching a Computer Science course at the University of Toronto in which graduate and undergraduate students will do some consulting and/or development work for real-world clients. The students have backgrounds in areas as diverse as network security, user interface design, machine learning, graph theory, and numerical analysis, so pretty much anything is possible — the end-of-term flyer from last April will give you an idea of what they can do.

Here are the details:

  • Students can’t get a grade for work they’re being paid to do, so it has to be pro bono.
  • Clients in downtown Toronto are preferred (makes face-to-face meetings easier), but we’ve worked successfully with remote clients and open source groups before.
  • Pure coding projects are OK for undergrads, but grad student projects have to require some novel thinking as well (and that’s preferred for undergrad projects too).
  • They have to be able to talk about their project in public, and use whatever code they develop after the project is over. This doesn’t necessarily mean that projects have to be open source, but that definitely makes things simpler. (In the past, for example, students have sometimes had access to sensitive data that they couldn’t share with others, but were allowed to talk about the algorithms they were using and the patterns they were finding—that sort of thing is doable.)

So, could one or two of these students do something useful for you? If so, please let me know.


James Tauber: Discrete Cosine Transforms Part 1

19 Nov 2008 00:30:11

I've often been intrigued by the lossy part of JPEG compression so I thought I'd explore Discrete Cosine Transforms and their use in JPEG as a short multi-part blog series.

In this part, let's just talk about what a "discrete cosine" function is and then in the next couple of posts look at how the concept can be combined with basic linear algebra to break up images into components in such a way that you can throw out some components with minimal effect on the perceived image. It's quite clever but the mathematics is fairly straightforward.

Let's start with a single cycle of a cosine function shifted up and scaled so its values range from 0-255 instead of from -1 to +1. To make it discrete, we'll divide it up into four and simply take the value of the scaled cosine function at the midpoint of each of our four sections:

Here our four discrete values are 217, 39, 39, 217.

Now let's do the same for one and a half cycles:

which yields values of 176, 11, 245 and 80.

Now half a cycle:

which yields 245, 176, 80 and 11. And finally zero cycles:

which gives us 255, 255, 255, 255.

We've basically calculated values for 0 thru 3 half cycles. If we wanted to split the cosine into N pieces instead of 4, we'd calculate values for 0 thru N-1 half cycles.

But, in summary, for N = 4:

  • k = 3: 176, 11, 245, 80
  • k = 2: 217, 39, 39, 217
  • k = 1: 245, 176, 80, 11
  • k = 0: 255, 255, 255, 255
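
The table above can be approximated with a few lines of Python. This is a sketch under an assumption: the exact scaling and rounding behind the numbers in this post isn't spelled out, so the code simply maps the cosine from [-1, 1] to [0, 255] and rounds to the nearest integer. Its output lands within a couple of units of the values listed, not exactly on them. The function name discrete_cosine is made up for illustration.

```python
import math


def discrete_cosine(k, n_samples=4, scale=255):
    """Sample k half cycles of a cosine at the midpoint of each section.

    The cosine is shifted and scaled from [-1, 1] to [0, scale].
    """
    values = []
    for n in range(n_samples):
        # Midpoint of section n, measured in half cycles of the cosine.
        angle = math.pi * k * (n + 0.5) / n_samples
        values.append(round(scale * (math.cos(angle) + 1) / 2))
    return values


table = {k: discrete_cosine(k) for k in range(4)}
```

Changing n_samples gives the N-piece version mentioned above, with k running from 0 to N-1.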

Ned Batchelder's blog: Victoria Marcus Olds, 1911–2008

19 Nov 2008 00:14:08

My grandmother died this morning. She was 97 years old, a good long life. She lived in New York City most of my life, but lived in a nursing home near me in Boston for the last year, so I saw her a few times, and those visits taught me some things about her.


Brian Jones: MySQL Problem and Solution Posts: r0ck.

18 Nov 2008 21:30:50

Taming MySQL is… challenging. Especially in very large, fast-growth, ‘always-on’ environments. It’s one of those things where you seemingly can never know all there is to know about it. That’s why I really like coming across posts like this one from FreshBooks that describes a very real problem that was affecting their users, how they dealt with it, why *that* failed, and what the final fix was. Post a link to your favorite MySQL Problem and Solution post in the comments (oh yeah, and “subscribe to comments” should be working now!)


Sean McGrath: Open Source Considered Harmful

18 Nov 2008 20:02:00
    'It is hot in this kitchen and getting hotter all the time. What to do? Find a way to reduce the heat or leverage the latest asbestos suits? We are mostly doing the latter.' -- Open Source Considered Harmful

Beginning Python for Bioinformatics: Creating an interface for the motif finding script, some corrections

18 Nov 2008 19:45:31

We need to pause a bit and make some corrections to our code. First, the code I posted in the last entry for the pymotif.py module is wrong. OK, not wrong, but some of the code I use for testing ended up on the blog. The first two lines of the calculate_motifs function contained a link to the files I use for testing and should be replaced by

 input_seqs = fasta.read_seqs(open(input_seqs).readlines())
 input_seqs2 = fasta.read_seqs(open(input_seqs2).readlines())

Also, both variables that store the filenames and paths in pymoteGUI.py are declared in the wrong scope. They should have been declared at the pymotGUI class level, so they are accessible to all the functions in that class. This also means that every time we access a variable it should be preceded by the class name in order for the interpreter to know where to get the value from. So both corrected files would be

 #!/usr/bin/env python

 import wx
 import pymot
 import pymotif
 import fasta
 import os

 class pymot(wx.App):

     def __init__(self, redirect=False):
         wx.App.__init__(self, redirect)

 class pymotGUI(wx.Frame):

     fore_file = ''
     back_file = ''

     def __init__(self, parent, id):
         wx.Frame.__init__(self, parent, id, 'Python Motif Finder', style=wx.DEFAULT_FRAME_STYLE)
         self.__do_layout()

     def __do_layout(self):

         #adding the panel
         panel = wx.Panel(self)

         #defines the menubar
         menubar = wx.MenuBar()

         #file menu
         filemenu = wx.Menu()
         foreground_menu = filemenu.Append(-1, 'Select foreground file')
         background_menu = filemenu.Append(-1, 'Select background file')
         sep = filemenu.AppendSeparator()
         quitmenu = filemenu.Append(-1, 'Quit')

         #appends the menu to the menubar and creates it
         menubar.Append(filemenu, 'File')
         self.SetMenuBar(menubar)

         #input box for motif width, and label
         self.one_label = wx.StaticText(panel, -1, 'Motif width', (10,50))
         self.motif_width = wx.TextCtrl(panel, -1, '10', (95, 50), (40,18))
         #result textbox
         self.results = wx.TextCtrl(panel, -1, '', (150, 50), (200, 100), wx.TE_MULTILINE | wx.TE_AUTO_SCROLL | wx.HSCROLL)

         #run button
         self.run_button = wx.Button(panel, -1, 'Run', (10, 80))

         #labels
         self.fore_label = wx.StaticText(panel, -1, 'Select the foreground file', (10, 10))
         self.back_label = wx.StaticText(panel, -1, 'Select the background file', (10, 30))

         #binding the menus to functions
         self.Bind(wx.EVT_MENU, self.on_foreground, foreground_menu)
         self.Bind(wx.EVT_MENU, self.on_background, background_menu)
         self.Bind(wx.EVT_BUTTON, self.run_finder, self.run_button)

     def on_foreground(self, event):
         dialog = wx.FileDialog(self, style=wx.OPEN)
         if dialog.ShowModal() == wx.ID_OK:
             pymotGUI.fore_file = dialog.GetPath()
             self.fore_label.SetLabel(pymotGUI.fore_file)

     def on_background(self, event):
         dialog = wx.FileDialog(self, style=wx.OPEN)
         if dialog.ShowModal() == wx.ID_OK:
             pymotGUI.back_file = dialog.GetPath()
             self.back_label.SetLabel(pymotGUI.back_file)

     def run_finder(self, event):
         print pymotGUI.fore_file
         result = pymotif.calculate_motifs(pymotGUI.fore_file, pymotGUI.back_file)
         for motif in result:
             self.results.WriteText(motif + '\n')
         #wx.MessageBox('It should run, eh?')

 #if __name__ == '__main__':
 app = pymot()
 frame = pymotGUI(parent=None, id = -1)
 #frame.CentreOnScreen()
 frame.Show()
 app.MainLoop()

and

 #!/usr/bin/env python

 import fasta
 import sys
 from collections import defaultdict

 def choose(n, k):
     if 0 <= k <= n:
         ntok = 1
         ktok = 1
         for t in xrange(1, min(k, n - k) + 1):
             ntok *= n
             ktok *= t
             n -= 1
         return ntok // ktok
     else:
         return 0

 def get_quorums(seqs, mlen):
     """
     add seq id_no to a set
     use explicit counter to create seq_no
     """
     quorum = defaultdict(int)
     for seq in seqs:
         for n in range(len(seq) - mlen):
             quorum[seq[n:n + mlen]] += 1
     return quorum

 def calculate_motifs(input_seqs, input_seqs2):

     print input_seqs, input_seqs2
     input_seqs = fasta.read_seqs(open(input_seqs).readlines())
     input_seqs2 = fasta.read_seqs(open(input_seqs2).readlines())

     foreground = get_quorums(input_seqs, 10)
     background = get_quorums(input_seqs2, 10)

     N = len(input_seqs) + len(input_seqs2)

     res_motifs = []
     for i in foreground:
         term1 = choose(background[i], foreground[i])
         term2 = choose((N - background[i]), len(input_seqs) - 1)
         term3 = choose(N, len(input_seqs))
         p = (float(term1) * float(term2)) / term3
         if 0 < p <= 0.0001:
             res_motifs.append(i + '\t' + str(foreground[i]) + '\t' + str(background[i]) + '\t' + str(p))

     res_motifs.sort()
     return res_motifs

On the next post, the last in the series, we will just check how to get the value from the width input box and wrap-up everything.
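
For readers who want to see the sliding-window counting in isolation, here is a tiny stand-alone sketch in the spirit of get_quorums. It is not a drop-in replacement: the name count_windows is made up, and it uses range(len(seq) - mlen + 1) so that the final window is counted too, whereas the loop in get_quorums stops one position earlier (which may be deliberate, for example if the sequences still carry a trailing newline).

```python
from collections import defaultdict


def count_windows(seqs, mlen):
    """Count every substring of length mlen across all sequences."""
    counts = defaultdict(int)
    for seq in seqs:
        # +1 so the window ending at the last character is included.
        for start in range(len(seq) - mlen + 1):
            counts[seq[start:start + mlen]] += 1
    return counts


# Hypothetical toy sequence, only to show the shape of the result.
toy = count_windows(["ACGTAC"], 2)
```

For the toy sequence, 'AC' is counted twice and 'CG', 'GT' and 'TA' once each.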




