Modern parallel file systems achieve high performance using a variety of techniques, such as striping files across multiple disks to increase aggregate I/O bandwidth and spreading disks across multiple servers to increase aggregate interconnect bandwidth. To achieve peak performance from such systems, it is typically necessary to utilize multiple concurrent readers/writers from multiple systems to overcome various singlesystem limitations, such as number of processors and network bandwidth. The standard cp and md5sum tools of GNU coreutils found on every modern Unix/Linux system, however, utilize a single execution thread on a single CPU core of a single system, and hence cannot take full advantage of the increased performance of clustered file systems. Mcp and msum are drop-in replacements for the standard cp and md5sum programs that utilize multiple types of parallelism and other optimizations to achieve maximum copy and checksum performance on clustered file systems. Multi-threading is used to ensure that nodes are kept as busy as possible. Read/write parallelism allows individual operations of a single copy to be overlapped using asynchronous I/O. Multinode cooperation allows different nodes to take part in the same copy/checksum. Split-file processing allows multiple threads to operate concurrently on the same file. Finally, hash trees allow inherently serial checksums to be performed in parallel. Mcp and msum provide significant performance improvements over standard cp and md5sum using multiple types of parallelism and other optimizations. The total speed-ups from all improvements are significant. Mcp improves cp performance over 27x, msum improves md5sum performance almost 19x, and the combination of mcp and msum improves verified copies via cp and md5sum by almost 22x. These improvements come in the form of drop-in replacements for cp and md5sum, so are easily used and are available for download as open source software at http://mutil.sourceforge.net.