About

These were the UNIX shell programming assignments for the 2005/2006 fall semester. The original posting, in Hungarian, can be found here. Below is my English translation of the problems, along with the solutions I wrote. The original posting contained strict rules for the format of error messages, to help automated testing; these are not reproduced here. The test suite also enforced run-time limits, which is why all of my solutions are written in Awk, with just a thin crust of shell scripting to process command-line switches.

Mirroring
Rotation
Scaling
Columniation
Columniation redux
Table to List
List to Table
Letter Frequency
Word Frequency

1. Mirroring

Write a filter that mirrors its text input horizontally. Make sure the whole rectangular area of the input is mirrored, even when the input lines are not of the same length. Additionally, the lines of the output should contain no trailing spaces. Also implement the following switches:

-c

Replace "graphical" characters with their mirrored counterparts, according to this table:

Replace this:  ( ) < > \ / ` ' ] [ } {
With this:     ) ( > < / \ ' ` [ ] { }

-s

Strip leading spaces from the output lines
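
The mirroring rules above can be sketched as follows. This is an illustration in Python, not the Awk script linked below; the function name `mirror` and its keyword arguments `swap_chars` (for -c) and `strip_leading` (for -s) are hypothetical names chosen for this example.

```python
# Translation table for the -c switch, copied from the table above.
MIRROR_MAP = str.maketrans("()<>\\/`'][}{", ")(></\\'`[]{}")


def mirror(lines, swap_chars=False, strip_leading=False):
    """Mirror a list of text lines horizontally (hypothetical helper)."""
    # Pad every line to the width of the longest one, so that the whole
    # rectangular area is mirrored, not each line on its own.
    width = max((len(line) for line in lines), default=0)
    out = []
    for line in lines:
        flipped = line.ljust(width)[::-1]
        if swap_chars:
            flipped = flipped.translate(MIRROR_MAP)
        # The output must not contain trailing spaces.
        flipped = flipped.rstrip(" ")
        if strip_leading:
            flipped = flipped.lstrip(" ")
        out.append(flipped)
    return out
```

For example, `mirror(["ab(", "c"])` pads the second line to three characters before reversing it, so the `c` ends up in the third column.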

Download my script

2. Rotation

Write a filter that rotates its text input 90 degrees CCW, that is, the first line becomes the first column, and the last column becomes the first line. Exclude leading and trailing empty lines from the output (an empty line is one containing only whitespace characters). Also implement the following switches:

-c

Replace "graphical" characters with their rotated counterparts, according to this table:

Replace this:  \ / ` , ' _ - |  
With this:     / \ , ' , | | _
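
A minimal sketch of the rotation, again in Python rather than the Awk of the linked script. The name `rotate_ccw` and the `swap_chars` argument (for -c) are hypothetical. The substitution table is copied exactly as printed above, and stripping trailing spaces from each output line is an assumption carried over from assignment 1, since this assignment does not mention it.

```python
# Translation table for the -c switch, copied as printed in the table above.
ROTATE_MAP = str.maketrans("\\/`,'_-|", "/\\,',||_")


def rotate_ccw(lines, swap_chars=False):
    """Rotate a list of text lines 90 degrees counterclockwise."""
    width = max((len(line) for line in lines), default=0)
    padded = [line.ljust(width) for line in lines]
    rotated = []
    # The last input column becomes the first output line.
    for col in range(width - 1, -1, -1):
        row = "".join(line[col] for line in padded)
        if swap_chars:
            row = row.translate(ROTATE_MAP)
        # Assumption: trailing spaces are dropped, as in assignment 1.
        rotated.append(row.rstrip(" "))
    # Exclude leading and trailing whitespace-only lines.
    while rotated and not rotated[0].strip():
        rotated.pop(0)
    while rotated and not rotated[-1].strip():
        rotated.pop()
    return rotated
```

With the input lines `AB` and `CD`, the result is `BD` followed by `AC`: the first input line now reads bottom to top in the first column.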

Download my script

3. Scaling

Write a filter that magnifies or shrinks its text input, based on the command line switch +mxn or -mxn. m and n are the horizontal and vertical scaling factors. A scaling of e.g. +3x2 produces from this input:

ABCD
1234

the following output:

AAABBBCCCDDD
AAABBBCCCDDD
111222333444
111222333444

A scaling of -mxn is to do the opposite of that, i.e. to output columns 1, m+1, 2m+1, ... from the lines 1, n+1, 2n+1, ... of the input.
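
Both directions can be sketched in a few lines of Python (an illustration, not the linked Awk script). The function name `scale` and the decision to pass the switch as a string such as `"+3x2"` are assumptions made for this example.

```python
import re


def scale(lines, spec):
    """Magnify (+mxn) or shrink (-mxn) a list of text lines."""
    match = re.fullmatch(r"([+-])(\d+)x(\d+)", spec)
    if match is None:
        raise ValueError("expected a switch of the form +mxn or -mxn")
    sign = match.group(1)
    m, n = int(match.group(2)), int(match.group(3))
    if m < 1 or n < 1:
        raise ValueError("scaling factors must be at least 1")
    if sign == "+":
        out = []
        for line in lines:
            # Repeat every character m times, then the whole line n times.
            wide = "".join(ch * m for ch in line)
            out.extend([wide] * n)
        return out
    # Shrinking keeps columns 1, m+1, 2m+1, ... of lines 1, n+1, 2n+1, ...
    return [line[::m] for line in lines[::n]]
```

Applying `+3x2` to the example input and then `-3x2` to the result gives back the original two lines.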

Download my script

4. Columniation

Write a filter that arranges the text input into columns, in accordance with the following conditions:

The output must fit into 80 columns, unless the input contains longer lines. If that is the case, output only one column.
The columns should be continuous, i.e. if you put the columns of the output beneath each other from left to right, you get the original input.
Every column is of the same width; that width is the length of the longest input line.
Given n lines of input, arranged into c columns, each column should contain int((n+c-1)/c) lines, with the exception of the last column.
The output should consist of the least possible number of lines.
Lines shorter than the column width are to be padded with spaces. Columns are separated by two spaces.
You are free to assume that the input doesn't contain special characters with an ASCII value below 0x20.

Note that for any given input, this specification is unambiguous, i.e. the combination of the above restrictions allows only one possible output.
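
One way to read the conditions above, sketched in Python rather than the Awk of the linked script. The name `columnate` and the parameters `max_width` and `gap` are hypothetical (the assignment fixes them at 80 and 2). Padding the last cell of each output line is also an assumption: the specification says short lines are padded but does not single out the last column.

```python
def columnate(lines, max_width=80, gap=2):
    """Arrange lines into equal-width columns, filled column by column."""
    if not lines:
        return []
    width = max(len(line) for line in lines)
    # c columns of this width need c*width + (c-1)*gap characters in total.
    # If even one column is too wide, fall back to a single column.
    max_cols = max(1, (max_width + gap) // (width + gap))
    # Fewest output lines: ceil(n / max_cols), computed with integers.
    rows = -(-len(lines) // max_cols)
    out = []
    for r in range(rows):
        # Output line r holds input lines r, r+rows, r+2*rows, ...
        cells = [line.ljust(width) for line in lines[r::rows]]
        out.append((" " * gap).join(cells))
    return out
```

Lowering `max_width` makes the behaviour easy to see on small inputs: five short lines with `max_width=6` come out as three output lines in two columns, the second column being one line shorter.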

Download my script

5. Columniation redux

Write a filter that arranges the text input into columns, in accordance with the following conditions (the only difference from assignment #4 is the rule for column width):

The output must fit into 80 columns, unless the input contains longer lines. If that is the case, output only one column.
The columns should be continuous, i.e. if you put the columns of the output beneath each other from left to right, you get the original input.
The width of every column is determined by the width of the longest line in that column.
Given n lines of input, arranged into c columns, each column should contain int((n+c-1)/c) lines, with the exception of the last column.
The output should consist of the least possible number of lines.
Lines shorter than the column width are to be padded with spaces. Columns are separated by two spaces.
You are free to assume that the input doesn't contain special characters with an ASCII value below 0x20.

Note that for any given input, this specification is unambiguous, i.e. the combination of the above restrictions allows only one possible output. Also note that this is very close to the way the GNU implementation of ls -C -w81 works.
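
With per-column widths, the column count can no longer be computed directly, so a simple approach is to try candidate column counts from the largest down and take the first layout that fits. The Python sketch below illustrates that search; it is not the linked Awk script and makes no claim about how GNU ls does it. The name `columnate_variable` and the parameters `max_width` and `gap` are hypothetical, and leaving the last cell of each output line unpadded when the last column is short is an assumption.

```python
def columnate_variable(lines, max_width=80, gap=2):
    """Arrange lines into columns whose widths are set per column."""
    n = len(lines)
    # More columns means fewer output lines, so try the largest count first.
    for cols in range(n, 0, -1):
        rows = -(-n // cols)  # int((n+c-1)/c) lines per column
        columns = [lines[i:i + rows] for i in range(0, n, rows)]
        widths = [max(len(s) for s in col) for col in columns]
        total = sum(widths) + gap * (len(columns) - 1)
        # A single column is always accepted, even if it is too wide.
        if cols == 1 or total <= max_width:
            return [
                (" " * gap).join(
                    col[r].ljust(w)
                    for col, w in zip(columns, widths)
                    if r < len(col)  # only the last column can be short
                )
                for r in range(rows)
            ]
    return []
```

On the five lines `a`, `bb`, `c`, `d`, `e` with `max_width=6`, the first column is two characters wide and the second only one, which is the difference from assignment #4.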

Download my script

6. Table to List

Write a filter that converts a TAB-delimited table to a colon-delimited list, using the first row of the table as field names.

Example: given this input:

header_1<TAB>header_2<TAB>...<TAB>header_m
data_1_1<TAB>data_1_2<TAB>...<TAB>data_1_m
data_2_1<TAB>data_2_2<TAB>...<TAB>data_2_m
...
data_n_1<TAB>data_n_2<TAB>...<TAB>data_n_m

produce the following output:

header

You are free to assume that the input is well-formed and contains at least two lines.

Download my script

7. List to Table

Write a filter that implements the exact reverse of the operation described in assignment #6 above. You are free to assume that the input is well-formed and contains at least one line, and that field names contain no colon characters.

Download my script

8. Letter Frequency

Given n files as command line arguments, calculate the frequency of letters for each file, and display the results in a table. Letters are defined to be members of the English alphabet [a-zA-Z] and the set of Hungarian accented letters áÁéÉíÍóÓöÖőŐúÚüÜűŰ.

The output should be a multi-column list, the first column being the list of lowercase letters encountered in any of the input files (sorted according to the C locale), and subsequent columns containing the number of occurrences of that letter in file₁ ... file_n, separated by spaces. Example output for 2 files:

Note how letters not encountered in any of the input files are not listed, i.e. there is no row containing only zeros.
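
The counting and table layout can be sketched as follows, in Python rather than the Awk of the linked script. The function `letter_table` is a hypothetical name; it takes the contents of the files as strings, so reading the files named on the command line is left to the caller. Sorting by UTF-8 byte values is an assumption about what "C locale" order means for the accented letters, since it depends on the encoding of the input files.

```python
from collections import Counter

LOWERCASE = "abcdefghijklmnopqrstuvwxyz" + "áéíóöőúüű"
LETTERS = set(LOWERCASE + LOWERCASE.upper())


def letter_table(texts):
    """Return one 'letter count_1 ... count_n' row per letter seen."""
    # One counter per file; upper and lower case are counted together.
    counts = [
        Counter(ch.lower() for ch in text if ch in LETTERS)
        for text in texts
    ]
    # Only letters that occur in at least one file get a row.
    seen = sorted(set().union(*counts), key=lambda ch: ch.encode("utf-8"))
    return [
        " ".join([ch] + [str(counter[ch]) for counter in counts])
        for ch in seen
    ]
```

A letter that appears in only some of the files gets a 0 in the other columns, and a letter that appears in none of them produces no row at all.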

Download my script

9. Word Frequency

Given n files as command line arguments, calculate the frequency of words for each file, and display the results in a table. A word is defined as a contiguous run of one or more letters (see the definition of letters above). Every non-letter character is to be considered whitespace.

The output should be a multi-column list, the first column being the list of words encountered (in lowercase) in any of the input files (sorted according to the C locale), and subsequent columns containing the number of occurrences of that word in file₁ ... file_n, separated by spaces. Example output for 2 files:

a 5 8
she 3 5
the 6 3
word 2 0
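
A minimal Python sketch of the word count (not the linked Awk script). The function `word_table` is a hypothetical name and takes the file contents as strings, leaving file reading to the caller. As with the letter table, sorting by UTF-8 byte values is an assumption about the intended C locale order.

```python
import re
from collections import Counter

# A word is a contiguous run of the letters defined in assignment 8.
WORD_RE = re.compile("[a-zA-ZáÁéÉíÍóÓöÖőŐúÚüÜűŰ]+")


def word_table(texts):
    """Return one 'word count_1 ... count_n' row per word seen."""
    # Every non-letter character acts as a separator; words are lowercased.
    counts = [
        Counter(word.lower() for word in WORD_RE.findall(text))
        for text in texts
    ]
    seen = sorted(set().union(*counts), key=lambda w: w.encode("utf-8"))
    return [
        " ".join([word] + [str(counter[word]) for counter in counts])
        for word in seen
    ]
```

Because digits and punctuation are treated as whitespace, an input such as `ab1cd` contributes the two words `ab` and `cd`.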

Download my script