
Reprint edition, Paperback
Compared to Perl, Python has been slow to gain adoption as the
scripting language of choice in bioinformatics, although it has been
gaining momentum recently. If you read job descriptions for
bioinformatics engineer or scientist positions from a few years back,
you barely saw Python mentioned, even as a "nice to have" optional
skill. One reason is probably the lack of good introductory
bioinformatics books in Python, so in general fewer people think of
Python as a good choice for bioinformatics. The book "Beginning Perl
for Bioinformatics" from O'Reilly was published in 2001. Almost a
decade later, we finally have "Bioinformatics Programming Using
Python" by Mitchell Model to fill the gap.

When I first skimmed "Bioinformatics Programming Using Python", I got
the impression that it was more like "learning Python using
bioinformatics as examples" and felt a little disappointed, as I was
hoping for more advanced content.
However, once I went through the book, reading the preface and
everything else chapter by chapter, I understood the main target
audience the author had in mind, and I think the author did a great
job of fulfilling the book's main purpose.

In modern biological research,
scientists can easily generate such large amounts of data that the
Excel spreadsheets most bench scientists use to process limited
amounts of data are no longer an option. I personally believe that the
new generation of biologists will have to learn how to process and
manage large amounts of heterogeneous data in order to make new
discoveries from it. This requires general computational skills beyond
just knowing how to use the special-purpose applications that a
software vendor can provide. The book gives a good introduction to
practical computational skills, using Python to process bioinformatics
data.

The book is very well organized
for a newcomer who just wants to start processing raw data on their
own and to become a Python programmer through learning-by-doing. The
book starts with an introduction to the primitive data types in Python
and moves on to flow control and collection data types, with emphasis
on, not surprisingly, string processing and file parsing, two of the
most common tasks in bioinformatics. Then the author introduces
object-oriented programming in Python. I think a beginner will also
like the code templates for different patterns of data-processing
tasks in Chapter 4. They summarize the usual flow structure for common
tasks very well.

After covering the basic concepts of
programming with Python, the author turns to other utilities that are
very useful in day-to-day work for gathering, extracting, and
processing data from different sources. For example, the author
discusses how to explore and organize files at the OS level with
Python, how to use regular expressions to extract data from
complicated text files, XML processing, web programming for fetching
online biological data and sharing data through a simple web server,
and, of course, how to use Python to interact with a database. Each of
these topics could fill a book of its own if covered in depth. The
author does a good job of covering all of them concisely. This helps
readers see what can be done easily with Python and, if they want,
learn more about any of those topics from other resources.

The final touch of
the book is on structured graphics. This is a very wise choice, since
most bioinformatics data is likely to end up as graphs in
presentations and publications. Again, there are many Python packages
that can help scientists generate nice graphs, but the author focuses
on one or two of them to show readers how to generate some graphs, and
readers should be able to learn others from there.

One thing I wish the author had also covered,
at least at a beginner level, is the numerical and statistical side of
bioinformatics computing with Python. For example, NumPy and SciPy are
very useful for processing large amounts of data, generating
statistics, and evaluating the significance of results. They are
especially useful when the data is so large that native Python objects
are no longer efficient enough. The numerical computation side of
bioinformatics is basically missing from the book.

The other thing
that might be desirable for such a book is to show that Python is a
great tool for prototyping bioinformatics algorithms. This is probably
my own bias, but I do think it would be nice to show implementations
of some basic bioinformatics algorithms in Python. This would help
readers understand a little more about the common algorithms used in
the field and give them a taste of slightly more advanced programming.

Overall, I would not hesitate to recommend this
book to anyone who would like to start processing biological data on
their own with Python. Moreover, it can serve as a good introductory
book on Python regardless of its focus on bioinformatics examples. The
book covers most basic day-to-day bioinformatics tasks and shows that
Python is a great tool for them. I think some slightly more advanced
topics, especially basic numerical and statistical computation, would
also help the target audience. Unfortunately, none of these topics is
covered in the book. That said, even if you are an experienced Python
programmer in bioinformatics, the book's focus on Python 3 and its
many useful templates might serve well as a quick reference when you
are looking for something you have no direct experience with.
Category: Computers
Year: 2009
Edition: 1
Publisher: O'Reilly Media
Language: English
Pages: 524