CSC 372, Spring 2023 Assignment 2 Due: Friday, January 27, 2023 at 23:59:59

Introduction

Aside from one problem, bench.txt, there's no programming at all on this assignment. It's a combination of web research, creative thought, pondering, and a little writing.

For each of the problems whose name ends in .txt (like morefacts.txt and quote.txt) you are to answer by turning in a plain ASCII text file. Use an editor like Sublime, Vim, Emacs, Notepad++, etc. DO NOT submit Word documents, PDFs, Rich Text files, HTML documents, etc. As a double-check, your .txt files should look perfectly fine when displayed with cat on lectura. Also, the file command should show them as "ASCII text" or maybe "ASCII English text", or something similar.
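For instance, here's one way to run that double-check yourself before submitting (the file contents here are just a stand-in, and the exact wording of file's output varies a bit from system to system):

```shell
# Sanity-check a deliverable: "file" should report plain ASCII text,
# and "cat" should display it cleanly with no odd characters.
printf 'Ada 1980: An example line. -- anonymous\n' > morefacts.txt
file morefacts.txt
cat morefacts.txt
```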

For the .txt problems on this assignment and others in this class, we're always happy to look at your work in advance and give you a quick opinion on it. You can either mail it to 372s23 or turn it in and tell us you did. Don't wait too long, however; there may be a crunch as the deadline approaches.

Important: Create a symbolic link to the spring23/a2 directory

The directory /cs/www/classes/cs372/spring23/a2 on lectura has a couple of shell scripts, turnin and run-bench, and other files that you'll need to use to complete the assignment. The instructions in this write-up assume that you're working on lectura AND that you've made a symbolic link to the spring23/a2 directory.

Imagining that you're maintaining your assignment 2 files on lectura in a directory named ~/csc372/assignment2, here's how to make that symbolic link in the right place and then examine it with ls:

% cd ~/csc372/assignment2
% ln -s /cs/www/classes/cs372/spring23/a2 a2
% ls -l a2
lrwxrwxrwx 1 whm root 33 Jan 16 14:57 a2 -> /cs/www/classes/cs372/spring23/a2

If we now do ls -l a2/ (don't forget the slash!), we'll see files that are actually in /cs/www/classes/cs372/spring23/a2: (files present, sizes, and dates may vary)

% ls -l a2/
total 19
-rw-r--r-- 1 whm whm 1026 Jan 16 21:32 bench-avg.py
-rw-rw-r-- 1 whm whm  157 Jan 16 22:31 delivs
-rwxr-xr-x 1 whm whm 1032 Jan 16 20:58 run-bench
-rwxr-xr-x 1 whm whm  801 Sep 16 20:34 test-bench
-rwxr-xr-x 1 whm whm  232 Jan 16 14:52 test-benches
-rwxrwxr-x 1 whm whm  495 Jan 16 14:53 turnin
-rw-r--r-- 1 whm whm 1917 Jan 16 22:12 whm-bench.txt
-rw-r--r-- 1 whm whm 4294 Jan 16 16:15 whm-ten-things.txt

If you haven't worked with symbolic links, see slides 134+ in my 352 UNIX slides.

Problem 1. (6 points) morefacts.txt

When covering slide 28 in the intro set I said a sentence or two about various languages of historical or current significance. For this problem I'd like you to find three languages that are not mentioned on that slide and tell me a sentence or two about each.

I'll compile all the answers and post them on Piazza. Follow this format for your answers:

  1. The full response for each language should be a single line of text.
  2. Begin the line with the language's name followed by the year it appeared and then a colon and a space, followed by one or more sentences with whatever you want to say.
  3. End each line with an attribution that is either "anonymous" or your name in some form, like "-- W. Mitchell" or "-- William M." or "-- whm".

As examples of both the format and the sort of thing I'm looking for, here are three of mine, one with anonymous attribution. I'll use cat and fold to display my morefacts.txt and then wc -l to demonstrate it's exactly three lines:

% cat morefacts.txt | fold -75
Ada 1980: The DoD's attempt to have one language for military embedded syst
ems, instead of 450. -- William Mitchell
Java 1995: The most rapidly adopted language of all time.   In Spring 1997
I gave one lecture in 372 about Java as a rising language; by Fall 1998 it
was being taught in 127A. -- William Mitchell
Scala 2003: Proof that Germans should stick to beer and BMWs. -- anonymous
% wc -l morefacts.txt
3 morefacts.txt

Feel free to use Google, Wikipedia, etc., for research on this question but needless to say, no posts anywhere soliciting ideas.

Above I say, "the year it appeared" but that's often subject to debate. Feel free to believe Wikipedia or go with another source.

Just to be clear, you may use anonymous attribution to keep your classmates from knowing you wrote a particular entry but what you submit must be original, not something you found on the net.

Some of you may find that you share a first, middle, or last name with a programming language.

Problem 2. (2 points) quote.txt

We started the class with a quote from Alan J. Perlis about programming languages. For this problem I'd like you to find a quote you like about programming languages by somebody other than Perlis.

Put the quote in quote.txt in any format you'd like. Here's an example using Perlis' quote:

% cat quote.txt
"A language that doesn't affect the way you
think about programming is not worth knowing."
                            --Alan J. Perlis

Important: The quote must be about programming languages in some way, not just about programming or software development in general.

Problem 3. (3 points) jshell.txt

I think it's pretty stupid that it took the developers/maintainers of Java more than twenty years to provide a REPL for Java, but that's the sad fact: JShell was released as part of JDK 9, in 2017.

For this problem I'd like you to spend maybe fifteen minutes experimenting with JShell and write a few sentences about your observations. You might talk about what you liked, what you expected to work but didn't, what you'd like to change, surprising behaviors, etc. If your 210 instructor showed or encouraged JShell, did it help or hurt when learning Java? If you didn't see JShell in 210, do you think it would have helped you learn Java?

If you've got Java installed on your machine, running jshell at a command-line prompt should start it up. If not, you can run jshell on lectura.

Some interaction with JShell is shown on slides 24 and 52 in the Haskell set. Here's a little more interaction with JShell:

% jshell
|  Welcome to JShell -- Version 11.0.16
|  For an introduction type: /help intro

jshell> 3 + 4
$1 ==> 7

jshell> $1 + "abc"
$2 ==> "7abc"

jshell> int i, j = 7;
i ==> 0
j ==> 7

jshell> /var
|    int $1 = 7
|    String $2 = "7abc"
|    int i = 0
|    int j = 7

jshell> /exit (or control-D to exit)
|  Goodbye
%

Note that /var shows variables and their values. /help shows all the commands.

Also, at the jshell> prompt, try typing "".s<TAB>

Problem 4. (15 points) counterparts.txt

Many elements of Python and Java can be called counterparts or said to be analogous. For example, Python's str and Java's String are counterparts. Java's System.out.println method is a counterpart to Python's print function.
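To illustrate the kind of difference I have in mind, here's one example of my own (not one you need to reuse): integer division on negative operands behaves differently in the two languages' counterpart operators.

```python
# Python's // floors toward negative infinity:
print(-7 // 2)   # prints -4
# Java's counterpart, the int expression -7 / 2, truncates toward zero
# and yields -3 instead.
```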

For this problem you are to identify fifteen examples of Java and Python counterparts (and elements thereof) that differ significantly from each other.

Here are four examples of significant differences in Python and Java counterparts:

Notes:

Problem 5. (10 points) ten-things.txt

Find some programming language that you know essentially nothing about and, using examples of code, tell me ten things about that language. You may not write about Haskell, Prolog, or any LISP.

It's fine to copy code verbatim from wherever you find it, but include a list of your sources.

Take a look at a2/whm-ten-things.txt for an example of what we're looking for.

Use any sort of plain-text formatting that you want but number your ten items in some way.

Problem 6. (6 points) expressions.txt

Let's apply what you've learned about type/value/side-effect by analyzing three expressions. I suggest that you work through each on paper first, and then try it with the Python Shell and JShell. (Remember: I'd call them REPLs, not "shells".)

  1. In Java, given these declarations,
    int x = 3, y = 4, z;
    
    what's the type, value, and side effect(s), if any, of the following expression?
    z = x++ * --y
    
  2. In Java, given this declaration,
    float f = 0;
    
    what's the type, value, and side effect(s), if any, of the following expression?
    Math.abs(-(f += 'f'))
    
  3. In Python, assuming that the value of d is {False:[], None:True}, what's the type, value, and side effect(s), if any, of the following expression?
    d[d[not 1].append(x:=1)]
    

    Note: The := operator is new in Python 3.8. It has become known as the "walrus operator". If you haven't seen it before, do some Googling, and experiment some—it's definitely a good addition to your Python toolbox. It is formally described here.
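If it helps, here's a tiny example of the walrus operator on its own, unrelated to the expression above: it binds a name as a side effect of evaluating an expression, and the expression's value is the value assigned.

```python
# := assigns to n AND yields the assigned value, so the binding
# and the comparison happen in a single expression.
if (n := len("walrus")) > 5:
    print(n)        # prints 6; n stays bound after the if
```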

Problem 7. (3 points) decisions.txt

In our first class we briefly talked about some of the language design decisions embodied in Python's print function:

For this problem you are to identify three language design decisions embodied in EITHER the three-argument form of Python's range, i.e., range(start,stop,step) OR Python's str.split method.

Problem 8. (10 points) bench.txt

Your task in this problem is to devise a simple computation whose execution time depends on one parameter, then implement it in Java and Python, and then compare the performance of the Java and Python versions for three different cases of parameter values. In essence, you are to create a simple benchmark to explore the relative performance of Java and Python.

Background

In general, benchmarking is a complicated undertaking that's packed with pitfalls and has an interesting cast of characters. However, we're just going to have a little fun with benchmarking. I'll start by showing you my solution for this problem.

Here's the computation I devised for my solution/implementation of this problem:

For some number N, first build a string of the form "1,2,...,N,". Then print N, the total length of the string, the number of zeroes in the string and then the contents of the string, in single quotes.

If N is 10, here's the expected output:

N=10: 21 total chars, 1 zeroes, s: '1,2,3,4,5,6,7,8,9,10,'

If N is 7,000,000, here's the expected output, with about 55 million characters elided from the middle:

N=7000000: 54888896 total chars, 4088895 zeroes, s: '1,2,3,4,5,6,7,8,9,
10,11,12,13,14,15,16,17,18,19...6999995,6999996,6999997,6999998,6999999,
7000000,'

Here is my Java implementation of the computation:

public class bench {
    public static void main(String args[]) {
        //int Ns[] = {1, 3, 10}; // small Ns for testing
        int Ns[] = {1_000_000, 3_000_000, 7_000_000, 100_000, 200_000};

        StringBuilder sb = new StringBuilder();

        int N = Ns[Integer.parseInt(args[0])-1];
        for (int i = 1; i <= N; i++) {
            sb.append(i);
            sb.append(',');
        }

        String s = sb.toString();
        int count = 0;
        for (int i = 0; i < s.length(); i++)
            if (s.charAt(i) == '0')
                count++;

        System.out.format("N=%d: %d total chars, %d zeroes, s: '%s'\n",
            N, s.length(), count, s);
    }
}

Here is my Python implementation of the computation:

import sys

#Ns = [1, 3, 10] # small Ns for testing
Ns = [1000000, 3000000, 7000000]

s = ""
N = Ns[int(sys.argv[1])-1]
for i in range(1,N+1):
    s += str(i) + ","

count = 0
for i in range(len(s)):
    if s[i] == '0':
        count += 1

print(f"N={N}: {len(s)} total chars, {count} zeroes, s: '{s}'")

Let's time the Python version for the first case: N=1,000,000. (Examine the code to see how the command-line argument 1 is used to select the first element of the list Ns.)

% time python3 bench.py 1 >p

real    0m1.171s
user    0m1.121s
sys     0m0.025s

The first non-blank line of time output, with real 0m1.171s, is sometimes called the "wall-clock" time: it's the amount of time you'd have seen pass if you'd been watching a clock on the wall. The third line, sys 0m0.025s, is the amount of time the operating system spent doing work on your behalf, i.e., the time spent in "kernel mode". The middle line, user 0m1.121s, is the line we're interested in. It shows how much time we spent in our Python code: 1.121 seconds. (User time does include compilation into Python bytecode, but that's pretty negligible.)

The redirection >p causes the Python program's output to be sent to a file named p (for Python).

Because the time command is built into shells (try type time), its output can vary from shell to shell. The above output is from Bash. Here's what we might see with zsh:

% time python3 bench.py 1 >p
python3 bench.py 1 > p  1.14s user 0.02s system 99% cpu 1.166 total

Let's time the Java program for the first case. Instead of just using java bench.java to compile and run, we'll compile it first with javac, since Java compilation time is somewhat significant. (Try time javac bench.java to see how long the compilation takes.)

% javac bench.java
% time java bench 1 >j

real    0m0.199s
user    0m0.212s
sys     0m0.060s

Comparing the user times from the two runs, we see that the Java version ran 5.29 (1.121/.212) times faster than Python.

We should also be sure that the Java and Python versions produced identical output. We'll use cmp for a yes/no test:

% cmp j p
%

A behavior common among UNIX tools is for them to produce no output if an operation was successful. (In 352 I say, "Silence is Golden".) Above we see no output from cmp j p, so j and p are identical.
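Relatedly, cmp also reports its result through its exit status, which is handy in scripts. A small sketch (assuming files j and p exist):

```shell
# cmp exits 0 when the files are byte-for-byte identical;
# -s suppresses all output so only the exit status matters.
if cmp -s j p; then
    echo "outputs match"
else
    echo "outputs differ"
fi
```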

To see what cmp shows if the files differ, let's compare bench.java and bench.py:

% cmp bench.java bench.py
bench.java bench.py differ: char 1, line 1

Let's look at the sizes, just for fun:

% ls -l j p
-rw-r--r-- 1 whm staff 6888949 Jan 16 13:50 j
-rw-r--r-- 1 whm staff 6888949 Jan 16 13:52 p

If a cmp fails, or I get a "diff" with diff, I'll often first look at the sizes of the output files. One of the files might be empty, for example.

Let's have a shell script do all the work of running both Java and Python versions, extracting timings, ensuring identical output from both versions, and more. It also runs javac bench.java if needed.

If this were 352, I'd have you write that script yourself, but since this is 372, I'm supplying it for you. It's in a2/run-bench. Assuming you've made an a2 symlink, you can type cat a2/run-bench to read the code, and I hope you do!

Let's run the script. We'll give it one argument, a 1, to indicate we want to run the first case. (That argument, 1, is in turn passed along to java bench 1 and python3 bench.py 1.)

% a2/run-bench 1
Java:   0m0.360s
Python: 0m1.666s

Just so you know what it looks like, here's an example of what's shown if there's a run-time error in bench.java:

$ a2/run-bench 1
Java:   0m0.013s
Python: 0m1.059s
cmp: EOF on j which is empty
=====Java errors=====
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index -9 out of bounds for length 5
    at bench.main(bench.java:9)

A Python error would be similar.

Timing can vary from run to run. A good practice is to run a benchmark several times and see if there's any significant variation in the times reported. To facilitate that good practice, run-bench accepts a second, optional argument that indicates that multiple runs should be made.

Let's run the first case three times:

$ a2/run-bench 1 3
Java:   0m0.339s
Python: 0m1.817s

Java:   0m0.342s
Python: 0m1.745s

Java:   0m0.366s
Python: 0m1.691s

Sure enough, we do see some slight variation: 0.339-0.366 seconds for the Java version, and 1.691-1.817 seconds for the Python version.

We have one more tool: a2/bench-avg.py. It reads run-bench output, echoes it, and then prints some statistics on performance. Let's "pipe" the output of run-bench into bench-avg.py:

$ a2/run-bench 1 3 | python3 a2/bench-avg.py
Java:   0m0.353s
Python: 0m1.728s

Java:   0m0.444s
Python: 0m1.789s

Java:   0m0.409s
Python: 0m1.822s

Java was 4.43x faster than Python (0.402s vs. 1.780s for 3 runs)

We see that the mean time for the three Java runs was 0.402 seconds versus a mean of 1.780 seconds for the three Python runs, showing a 4.43x speed advantage for Java.
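If you're curious how timings like 0m0.353s can be turned into statistics, here's a minimal sketch of the idea. This is my own illustration, not the actual code in a2/bench-avg.py:

```python
import re

def to_seconds(t):
    """Convert a time string like '0m1.728s' to float seconds."""
    m = re.fullmatch(r'(\d+)m([\d.]+)s', t)
    return int(m.group(1)) * 60 + float(m.group(2))

# The three Java user times from the run above.
times = ['0m0.353s', '0m0.444s', '0m0.409s']
mean = sum(to_seconds(t) for t in times) / len(times)
print(f'{mean:.3f}')   # prints 0.402
```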

Let's try seven million for N via case 3, and run it four times:

$ a2/run-bench 3 4 | python3 a2/bench-avg.py
Java:   0m0.908s
Python: 0m14.167s

Java:   0m0.821s
Python: 0m13.100s

Java:   0m0.959s
Python: 0m13.314s

Java:   0m0.922s
Python: 0m13.046s

Java was 14.86x faster than Python (0.90250s vs. 13.40675s for 4 runs)

We see that for this benchmark, Java's advantage seems to increase as N grows.

Your Task

For this problem you are to:

Deliverables

There are three deliverables for this problem: bench.java, bench.py, and bench.txt.

bench.txt is a plain text file that should contain:

1. A brief description of your benchmark computation.
2. A transcript of running these three Bash commands:
  a2/run-bench 1 3 | python3 a2/bench-avg.py

  a2/run-bench 2 3 | python3 a2/bench-avg.py

  a2/run-bench 3 3 | python3 a2/bench-avg.py

3. A brief section with an interesting observation or two. If you have trouble finding something interesting, let us know!

Carrying through with my example benchmark, you can see what I created for bench.txt in a2/whm-bench.txt.

Problem 9. (ZERO points) quickrepl.txt

After working with languages that have a REPL available, you may find yourself really wanting a REPL when you go to learn a language that doesn't have one. One approach to a REPL is to integrate it with a language's implementation, but that typically requires a good understanding of that implementation. Another approach is a completely standalone REPL, such as once provided by javarepl.com, but that can require a lot of code: wc showed about 8,000 lines in Java source files for that system.

For this problem your challenge is to come up with a quick approach to writing a simple REPL for a language. Your quick REPL needn't be nearly as good as ghci, but it should be able to do the sorts of things you do when learning a language, such as evaluating expressions and showing types. Think about an approach that provides a lot of functionality for relatively little effort: something you could implement in a day, and not be late for dinner.

I'm not asking you to write any code for this problem; just describe how you might quickly produce a simple REPL for a language of your choice.

Python, Ruby, and other languages have an "eval(s)" function that executes the code contained in a string s. Do not rely on such a capability in your quick REPL.

Note: This is a hard problem that requires some creativity and cleverness, so I'm making it worth zero points, but I'm curious to see who'll do it anyway! For this problem you are free to work in groups of any size, but for me to possibly be impressed with your answer you'll have to specifically state that you came up with whatever you did all by yourself/yourselves, with no Googling.

If you work in a group, all members may submit identical copies of quickrepl.txt, but it should include a list of group members.

Problem 10. Extra Credit observations.txt

Submit a plain text file named observations.txt with...

(a) (1 point extra credit) An estimate of how many hours it took you to complete this assignment. Put that estimate on a line by itself, like this:

Hours: 3.5

There should be only one "Hours:" line in observations.txt. (It's fine if you care to provide per-problem times, and that data is useful to us, but report it in some form of your own invention that doesn't contain the string "Hours:". Thanks!)

Feedback and comments about the assignment are welcome, too. Was it too long, too hard, too detailed? Speak up! I appreciate all feedback, favorable or not.

(b) (1-3 points extra credit) Cite an interesting course-related observation (or observations) that you made while working on the assignment. The observation should have at least a little bit of depth. Think of me thinking "Good!" as one point, "Excellent!" as two points, and "Wow!" as three points.

Note: I'm looking for quality, not quantity!

Turning in your work

Here's the full list of deliverables for this assignment:

morefacts.txt
quote.txt
jshell.txt
counterparts.txt
ten-things.txt
expressions.txt
decisions.txt
bench.java
bench.py
bench.txt
quickrepl.txt (zero points, so it's optional!)
observations.txt (for extra credit)

Note that all characters in the file names are lowercase.

Do not include your name, NetID, etc. in your .txt files; I like reading answers without knowing who wrote them.

Use a2/turnin to submit your work. It creates a time-stamped "tar" file (similar to a zip file) that contains the deliverables listed above and then uses the system-wide turnin program to actually turn in that tar file.
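To give a rough idea of what that involves, here's a hypothetical sketch of bundling files into a time-stamped tar file. The actual a2/turnin script may well do it differently; read it to see:

```shell
# Hypothetical sketch: bundle some deliverables into a time-stamped tar file.
stamp=$(date +%Y%m%d.%H%M%S)
tar czf "a2.$stamp.tz" morefacts.txt quote.txt    # ...plus your other files
tar tzf "a2.$stamp.tz"                            # list what went into it
```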

Assuming you've made that symbolic link, you should see something like the following when you run a2/turnin:

% a2/turnin
Warning: quickrepl.txt not found
======== contents of a2.20230116.223543.tz ========
-rw-r--r-- whm/whm         387 2022-08-25 16:53 morefacts.txt
-rw-r--r-- whm/whm         134 2022-08-25 16:53 quote.txt
-rw-rw---- whm/whm        1170 2022-08-25 16:53 jshell.txt
-rw-rw-r-- whm/whm         851 2023-01-16 22:35 counterparts.txt
-rw-r--r-- whm/whm        4294 2023-01-16 22:33 ten-things.txt
-rw-rw-r-- whm/whm          58 2023-01-16 22:35 expressions.txt
-rw-rw-r-- whm/whm         420 2023-01-16 22:35 decisions.txt
-rw-r--r-- whm/whm         685 2023-01-16 22:34 bench.java
-rw-r--r-- whm/whm         320 2023-01-16 22:34 bench.py
-rw-rw---- whm/whm        1917 2023-01-16 22:34 bench.txt
-rw-rw---- whm/whm         483 2022-08-25 16:53 observations.txt
======== running turnin ========
Turning in:
 a2.20230116.223543.tz -- ok
All done.

Notice the first line of output, warning that the deliverable quickrepl.txt was not found. We'll imagine that because quickrepl.txt is a zero-point problem, the student elected not to do it.

The contents of the "tar" file that was created, named a2.20230116.223543.tz, are shown after that warning. Finally, the system-wide turnin program is run and its output is shown; we can see that the tar file with the deliverables was turned in ok.

You can run a2/turnin as often as you want. We'll grade your final non-late submission. You can run a2/turnin with a -ls option to see what's been turned in:

% a2/turnin -ls
.:
total 17
-rw-rw---- 1 whm cs372s23 1065 Jan 16 14:12 a2.20230116.141247.tz
-rw-rw---- 1 whm cs372s23 3432 Jan 16 22:35 a2.20230116.223543.tz

If you're curious, and I hope you are, look at a2/turnin. Remember that late submissions for assignments are not accepted and that there are no late days; but if circumstances beyond your control interfere with your work on this assignment, there may be grounds for an extension. See the syllabus for details.

Note that because each run of a2/turnin creates a new file, it can be used as a low-tech backup/snapshot of your work.