art with code

2010-08-19

Random programming ideas

  • Programming is about gathering knowledge about a piece of code. What does it do, what are its limitations, how does it perform, where can it be used, what is it doing right now, how do I extend it, how much memory does it use, how do I know that it works right, etc.
  • Testing, logging, documentation and profiling as an integral part of a language. Parse each function and module into a {body, tests, docs, logstream, profilestream}-struct that can be fiddled with at runtime.
  • Automatically test a function based on its type. Call it with different args, log the results and summarize detected patterns to give programmer a quick description of what the function does and what its time-space-complexity looks like. Add a way to flag found properties as want/do not want and check that subsequent versions of the function fulfill those (basically autogenerating QuickCheck rules).
  • Typed UNIX pipes. Use HTTP headers to tell the recipient what's coming through.
  • Auto-generate type->type-pipelines by putting conversion functions in a graph, run graph search on it with edge weight given by amount of information lost in each conversion (and secondarily performance).
  • Semantic type information to encode the effects of a function. Two functions that purport to do the same thing may well have semantic differences: suppose two image scaling functions where one assumes linear color space and the other does gamma correction. Both do ((RGB w h), new_width, new_height) -> (RGB new_width new_height), but the results differ.
  • Concurrency based on HW-level message passing. Processors read and write in cache lines, cache lines can't be shared between two processors => use cache lines as messages.
  • Array programming-like parallel array computation lib. Element-wise operations on N-wide vectors are easy to parallelize and vectorize: divide N by the amount of processors and the SIMD width. Needs to compile array-level pipes ((f . g . h) A) into element-level pipes (map (f' . g' . h') A) to maximize arithmetic per memory access. Maybe Data Parallel Haskell does this stuff already.
  • Uniform batteries-included interface for different data structures. Yeah, doable with typeclasses. Yeah, something like OCaml Batteries Included Enums. You can reduce data structures to either empty, fold and unfold (for streaming data structures) or alloc, length, get and set (for random access structures).
  • Data structure -generic regular expressions (i.e. not tied to just strings), are they useful? See proof of concept. What you need is a fold over the data structure (and I guess that most data structures are foldable, how else would you free the memory their elements use?)
  • Instruction set with bounded & bordered LOADs and STOREs: LOAD reg addr minAddr maxAddr [borderValue]. Throw hardware exception if a bounded LOAD goes out of bounds, load borderValue if a bordered LOAD goes out of bounds. Bordered STORE writes to the address in borderValue instead. Faster than two compares and JMP? More compiler-friendly? Eliminates buffer overflows? (Random literature search: Security improvement in embedded systems via an efficient hardware bound checking architecture, Checking Array Bound Violation Using Segmentation Hardware, A Comprehensive Approach to Array Bounds Check Elimination for Java, x86 segmented access mode, some mainframe architectures are said to have something like that, mysteries of life.)

No comments:

Blog Archive