Good, brief, specific tutorials on Unix commands and Perl
Missing case study examples to put it all together to solve common problems
This book will help you if you want to learn to program using Unix commands and the portions of the Perl programming language that are useful for working with data: reformatting, evaluating, summarizing, etc. It offers reasonably good tutorials on both subjects.
But the back cover of the book says this: "Your research has generated gigabytes of data and now you need to analyze it. You hate using spreadsheets, but it's all you know, so what else can you do? This book will transform how you work with large and complex data sets, teaching you powerful programming tools for slicing and dicing data to suit your needs."
The problem is, the book does not accomplish those goals. It may be possible to do some kinds of spreadsheet analysis using these tools, but this book does not give you examples. The authors do introduce two biology-oriented data formats, but the example manipulations of these are fairly trivial and each is oriented to demonstrating a single command. They do talk about SLICING (finding, matching, and extracting) data, but there is not much on DICING (rearranging, summarizing, combining) data.
Essentially, without examples, you'll be left to develop your own strategy to pull it together. To show you what I mean, consider the structure of the book:
Chapter 1 - Introduction - 8 pages
Chapter 2 - Installing Unix and Perl - 5 pages - does not really discuss installation beyond giving Windows users some options because Windows does not include Unix
Chapter 3 - An Introduction to Unix - 85 pages - introduces a subset of Unix *commands* (not Unix itself, which is an operating system) useful for working with data
Chapter 4 - An Introduction to Perl - 149 pages - introduces the basics of the Perl programming language
Chapter 5 - Advanced Unix - 34 pages. This chapter includes the only real discussion of working with data files in the sense the back cover is discussing. The chapter includes a few pages describing two biology data file formats, and uses them as examples for some command features.
Chapter 6 - Advanced Perl - 47 pages. Describes capabilities (such as multi-dimensional arrays) that are useful for data analysis, and includes a brief look at reading a comma-separated file into an array.
Chapter 7 - Programming Topics - 48 pages - tips on the practice of programming in general.
The only meaningful discussion of analyzing data is in Chapters 5 and 6, and it is superficial and brief. It serves to illustrate how one command works, rather than to teach strategies for using multiple capabilities together to solve a problem.
If I were to write a book with the same goals, I would structure it around typical problems to be solved, and introduce concepts as I go along. Sure, you need to have brief chapters on Unix and Perl basics, but there needs to be a LOT more focus on applying the technology to data analysis problems.
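To give a sense of what a problem-driven example might look like, here is a minimal sketch of a spreadsheet-style summary (a "pivot" of totals per gene) built from standard Unix commands. The data, the column names (sample, gene, count), and the variable names are all made up for illustration; none of this comes from the book.

```shell
# Hypothetical input: one measurement per row, comma-separated, with a header.
data=$(cat <<'EOF'
sample,gene,count
s1,geneA,10
s1,geneB,5
s2,geneA,7
s2,geneB,3
s3,geneA,1
EOF
)

# SLICING: extract only the rows for one gene.
sliced=$(printf '%s\n' "$data" | grep ',geneA,')

# DICING: drop the header, total the count column per gene, sort the result.
summary=$(printf '%s\n' "$data" \
  | tail -n +2 \
  | awk -F, '{ total[$2] += $3 } END { for (g in total) print g "," total[g] }' \
  | sort)

echo "$summary"
```

The point is not any single command, but that grep, tail, awk, and sort are chained to answer one question, which is the kind of combined strategy I would want a chapter to walk through.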
Even with the organization they used, the book would be more useful with several chapters that work through problems and solutions using the Unix and Perl material already presented, putting together what you have learned for practical purposes. These missing chapters should be 75-200% of the size of the introductory material that precedes them.
As a side comment, the Unix operating system is NOT required for this work. For Windows users, there are many sources of Unix work-alike commands and true Perl that will suffice for this book. The authors do mention one (Cygwin), but there are also commercial ones (MKS) and many other open-source or freeware alternatives that are not discussed.
Three stars on Amazon means "It's ok." If the book were positioned as a programming tutorial, I'd give it four stars, but it fails to address the problem statement written on the back of the book - manipulations of the sort you could do with a spreadsheet, including slicing and dicing - and is likely to disappoint someone who is looking for that focus.