Software_Description


PiTrex software is currently under development in C. Some early development work by Kevin Koster was performed with the PiCore Operating System (based on Tiny Core Linux) V. 9.0.3, but the primary development platform is now the Raspbian distribution V. 9.9 on a Raspberry Pi Zero W. This was chosen for two main reasons. Firstly, it generally makes for a quicker development loop if you can edit/compile/run on the same system, especially when the alternative involves swapping media or cabling. Secondly, we want our users to be able to develop games etc. for the system, and supporting a development environment is much easier if everyone has the same system, which would be the case if they worked on the Pi Zero itself using a development environment we created and supplied along with the system. This also has the advantage that the supplied system can rebuild itself in situ.

PiTrex I/O Library

The pitrexio-gpio library contains functions for initialising the Pi's GPIO hardware for use with the PiTrex Discrete cartridge, and for reading and writing to the 6522 VIA IC inside the Vectrex. Additional reading and writing functions are included using the IRQ LATCH feature as described in the Hardware_Description.

GPIO operations use the bcm2835 library.

Latest version is: pitrexio-gpio V. 0.2.3

Optimisations

Some optimisations are possible using the PiTrex Discrete hardware:

Batch Reads and Writes
The PiTrex_Discrete read and write operations may be optimised when multiple reads, or multiple writes, are performed in a row. Currently the functions in pitrexio-gpio.c are written with the assumption that every read may follow a write, and every write may follow a read. As such, the operations that set the GPIO pins as inputs or outputs for the operation are repeated needlessly if a write follows a write, or a read follows a read. These could be avoided if software is written to automatically batch read or write operations that occur sequentially, and to use a quicker read or write routine after the first read or write.
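The batching idea can be sketched as follows. This is only a minimal model, not the pitrexio-gpio code: the names bus_state, bus_set_dir, via_write_batched and via_read_batched are hypothetical, and the actual GPIO accesses are left out so that only the direction bookkeeping is shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the data bus direction; not the pitrexio-gpio API. */
enum bus_dir { BUS_UNKNOWN, BUS_INPUT, BUS_OUTPUT };

struct bus_state {
    enum bus_dir dir;      /* direction the GPIO data pins are currently set to */
    unsigned dir_changes;  /* how many times the pins had to be reconfigured */
};

/* Reconfigure the data pins only when the direction actually changes. */
void bus_set_dir(struct bus_state *bus, enum bus_dir wanted)
{
    assert(bus != 0 && wanted != BUS_UNKNOWN);
    if (bus->dir != wanted) {
        /* Real code would reprogram the GPIO pin directions here. */
        bus->dir = wanted;
        bus->dir_changes++;
    }
}

/* Write: a second write in a row skips the direction setup. */
void via_write_batched(struct bus_state *bus, uint8_t reg, uint8_t value)
{
    bus_set_dir(bus, BUS_OUTPUT);
    (void)reg;    /* address and data output omitted in this model */
    (void)value;
}

/* Read: a second read in a row skips the direction setup. */
uint8_t via_read_batched(struct bus_state *bus, uint8_t reg)
{
    bus_set_dir(bus, BUS_INPUT);
    (void)reg;
    return 0;     /* placeholder value: there is no hardware in this model */
}
```

In this model three writes followed by two reads reconfigure the pins only twice, rather than five times.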

Expanding on this, there is additional potential for this concept when the Raspberry Pi Zero can be guaranteed to respond to the Low-to-High transition (rising edge) of RDY within one cycle of the Vectrex 1.5MHz clock (E), such as when running software in a Real Time environment. Here, the PiTrex LATCH EN input can be held permanently High and the new data and address bus states for the following read or write operation can be set on the GPIO immediately following the rising edge of RDY. All intermediate steps can therefore be eliminated between sequential reads or sequential writes.

IRQ LATCH
This is a feature implemented in the PiTrex_Discrete design for performing operations after an Interrupt Request (IRQ) signal is received from the VIA. It is not to be confused with interrupts generated internally within the Pi, which might be used for code optimisation as discussed below.

The vector drawing routines need accurate timing. If software execution is randomly delayed by other processes running on Linux (10uS may equal a 10% difference), the end position of each drawn line varies, so the same lines end up in slightly different places each time the screen image is redrawn. This is visible in the result of the early write test program, which uses the Pi0 internal timer via the usleep() function. Accurate timing means using internal Timer 1 within the VIA chip in the Vectrex, and using the PiTrex hardware to respond to the IRQ signal triggered when this timer times out, without having to wait for the Pi0 to respond. The PiTrex hardware can be pre-configured by the Pi0 to perform a write that enables #BLANK after the IRQ signal is received, thereby turning off the CRT beam before a bright spot is left at the end of the line.

The implementation of this in the PiTrex discrete design is described in the Hardware_Description.

Event Detection Register
Hardware detection of signal states. Used for detecting the state of RDY.

This is required for using interrupts for the Raspberry Pi code as described below. Performance is also improved by polling this register instead of the state of the RDY input directly.

The EDR is working with Raspbian 9.9 and PiCore 9.0.3 (Tiny Core Linux derivative). Issues were found using the EDR with version 9.4 of Raspbian, but these appear to have since been resolved. To take advantage of the performance improvement, the USE_EDR compile-time option should be set unless these issues arise.
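One way to picture what the Event Detection Register adds over reading the pin level is the small simulation below. It is a sketch with no hardware access: the gpio_sim structure and the sim_set_level, poll_level and poll_event functions are hypothetical names, and RDY is assumed to be on GPIO 24 as in the config.txt example on this page. On the real hardware the bcm2835 library provides event detect calls (such as bcm2835_gpio_ren, bcm2835_gpio_eds and bcm2835_gpio_set_eds) for this purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RDY_PIN 24  /* assumption: RDY on GPIO 24, as in the config.txt example */

/* Simulated GPIO block: pin levels plus a latched event status register. */
struct gpio_sim {
    uint32_t level;      /* current pin levels */
    uint32_t rising_en;  /* pins with rising-edge detection enabled */
    uint32_t event;      /* latched events, cleared only by software */
};

/* "Hardware" side: change a pin level, latching a rising edge if enabled. */
void sim_set_level(struct gpio_sim *g, unsigned pin, bool high)
{
    assert(g != 0 && pin < 32);
    uint32_t mask = UINT32_C(1) << pin;
    bool was_high = (g->level & mask) != 0;
    if (high && !was_high && (g->rising_en & mask))
        g->event |= mask;
    g->level = high ? (g->level | mask) : (g->level & ~mask);
}

/* Software side: poll the pin level directly. */
bool poll_level(const struct gpio_sim *g, unsigned pin)
{
    assert(g != 0 && pin < 32);
    return ((g->level >> pin) & 1u) != 0;
}

/* Software side: poll the event latch, clearing it once it has been seen. */
bool poll_event(struct gpio_sim *g, unsigned pin)
{
    assert(g != 0 && pin < 32);
    uint32_t mask = UINT32_C(1) << pin;
    bool seen = (g->event & mask) != 0;
    g->event &= ~mask;
    return seen;
}
```

If RDY pulses High and Low again between two polls, poll_level reports nothing, while poll_event still reports the edge once. This only models the latching behaviour and says nothing about the relative speed of the two register reads on the Pi.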

Interrupts
To allow for asynchronous communication with the Vectrex bus, the PiTrex Discrete hardware requires a Ready (RDY) signal to be monitored in order to detect when a read or write operation has completed, before data can be sampled (for a read), or a new operation begun.

Currently the state of the RDY input is polled (either directly or by checking the status of the EDR) in order to detect this state. If an interrupt system were used, as is supported by the BCM2835 when using the EDR, the program running on the Pi, such as an emulator, could continue execution while waiting for this signal to be received.

A document describing the use of interrupts under Linux with the Raspberry Pi was found.

The examples are for almost exactly what needs to be done when reading the RDY signal from the PiTrex. However it also concludes by pointing out a fatal problem: the process of usefully handling interrupts under Linux (which is basically creating a new thread to wait on the interrupt while execution continues in the main thread) takes so long on the Pi that it makes the idea unworkable for the sort of regular operation that we require. The author's tests (with the Pi Zero) indicate that delays of around 100ms would be introduced which, applied to the PiTrex, would affect every read or write. So even if there are optimisations out there, as suspected (but undiscovered) by the author, there would be a very long way to go before it worked fast enough to suit our application, let alone improve performance over polling.

The author suggests that with a multicore system it would be faster to use one core just to poll the input while execution continues on another. For Linux with the RPi0 though, it looks like the methods already implemented for polling RDY are as good as we're going to get.

Another option is writing a Linux kernel driver. That thread also references some much better figures for Pi response time, though still a bit slow and the test conditions aren't clear. Maybe the kernel driver route is something to look into later. I'm inclined to think it can't be that hard, but then such things are easy to assume.

This test of real-time interrupt performance claims a (2.1 +/- 0.3)us interrupt response time running real-time on the Pi3. That first article claims (1.0 +/- 0.5)us minimum while polling an input, so roughly 1us is required by the hardware for the interrupt handling. So the lag under Linux is all due to the extra processes involved in starting/stopping the threads, or some other higher level problem.

That extra 1us might wipe out the advantage anyway, because RDY only lasts for one cycle of E, which at 1.5MHz is about 667ns. So while you gain about 667ns of extra processing time by not polling RDY, there's now an extra 1us before the next read/write can be performed. This really comes down to where the bottleneck is, but there may not be very much to be gained in any case, least of all under a multitasking OS.
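The arithmetic behind that comparison can be checked with a few lines of C. This is just a sketch of the estimate: the function names are hypothetical, and the 1000ns interrupt overhead passed in is the rough figure inferred above from the two quoted tests, not a measurement on the PiTrex.

```c
#include <assert.h>

/* Length of one clock cycle in nanoseconds, rounded to the nearest ns. */
unsigned cycle_ns(unsigned long clock_hz)
{
    assert(clock_hz > 0);
    return (unsigned)((1000000000UL + clock_hz / 2) / clock_hz);
}

/*
 * Estimated time gained (positive) or lost (negative) per bus operation,
 * in ns, if polling RDY for one E cycle is replaced by an interrupt that
 * costs irq_overhead_ns to respond to.
 */
int interrupt_net_gain_ns(unsigned long clock_hz, unsigned irq_overhead_ns)
{
    return (int)cycle_ns(clock_hz) - (int)irq_overhead_ns;
}
```

With a 1.5MHz clock, cycle_ns(1500000) gives 667 and interrupt_net_gain_ns(1500000, 1000) gives -333, i.e. under these assumptions the interrupt costs more time than the polling it replaces.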

Considering this, it has been decided not to implement interrupt support for detecting the RDY signal, at least for the moment.

Test Software

Kevin Koster wrote a number of basic test programs to confirm the correct operation of the PiTrex Discrete hardware. The first of these was the write test program that attempted to draw an image on the Vectrex monitor. Written without a complete understanding of low-level drawing routines on the Vectrex, and relying on the Pi0's internal timer for timing (which, as discussed regarding the IRQ LATCH feature, is subject to random delays when running under Linux), it fails to produce a clean image. Other test routines confirm the functionality of reading from the VIA.

All use the pitrexio-gpio PiTrex I/O library described above for performing read and write operations to the 6522 VIA chip in the Vectrex.

Write Test


Example of the write test program running

Although not very functional in its initial form, the write test program was written so as to form a potential basis for development of a general purpose vector drawing library.

The vector drawing routines in this program were largely based on the low-level vector drawing description in Drawing Vectors Differently, part of Malban's Vide documentation.

Further notes from an email written by Kevin Koster during development:

I spent a very long time getting my head around Vector drawing on the Vectrex, and how it relates to the PiTrex. In the end I've concluded that using the Vectrex internal Timer 1 for the scale factor is still the best option.

I concluded that there are two practical approaches to drawing vectors from the Pi. In my test program I've associated them with the functions vectordrawtopoint and vectordrawtolength. The first sets the X and Y voltages to correspond with the end point of a line, then waits long enough that it knows the line has been drawn fully (integrating capacitors fully charged to set voltage) before reasserting #BLANK. The second (not written yet) sets X and Y to the maximum values that would produce a line at the required angle, then times the draw (#RAMP) to determine how far in that direction it gets before it is stopped.

The vectordrawtopoint timing is non-critical because any delay in disabling #RAMP will occur after the beam has stopped moving anyway.

However to get to this point, the wait time will always be constant regardless of the length of the line to be drawn, because it is determined by 3x(RxC) (the "3x" is a guesstimate, it may be less if the beam movement is more sensitive than I've assumed) as a result of the capacitor in the integrator having to charge to the full supplied voltage to reach its final position. At 3x(RxC) this means that every movement will take approximately 300uS, and that works out to a maximum number of lines (or repositionings) at 50Hz update equal to 66 (though faster draws may be possible with constantly updated co-ordinates to create bendy lines, I gather that's what's being done here).

vectordrawtolength needs accurate timing because if the end of the draw varies as the Pi is delayed (10uS may equal 10% difference) that will affect the positioning of each redrawn screen image, with the exact lines always ending up in slightly different places. Presumably this would look blurry compared to an accurately positioned beam. This means using the internal timer 1, which also means catching its end interrupt soon enough to have the Pi enable #BLANK before a bright spot is left at the end of the line.

Anyway, so far I chose (partly because I hadn't finished thinking all of this through) to design the test program to use vectordrawtopoint for starters.

I haven't gone into the issues of drift in the Vectrex integrator circuitry because I didn't get around to looking into it fully, but that's another thing to consider (and I sort-of have in the pseudocode but not in the real test program code).

The pseudocode for the two approaches is below:

  • Receive start and end coordinates of the line to be drawn (from Emulator etc.). - Check if start is the same as the last end (or maybe close enough to?), in which case we can skip start.


vectordrawtopoint:

Start Draw:

  • Assert ZERO
  • Wait equivalent to 70 Vectrex CPU cycles, then deassert ZERO
  • Set Z
  • Set Y value in sample and hold circuit
  • Set X value.
  • Set Scale timer to generous approximate drawing time required (calculate using internals.txt math?)
  • Enable #RAMP and wait on interrupt from Scale timer.
  • On interrupt, zero Pi-side timer for positional drift in the Vectrex's analogue circuitry.


Continue Draw:

  • Check drift timer not expired (if so, use start routine)
  • Set Z (if changed?)
  • Set Y
  • Set X
  • Set Scale timer
  • Disable BLANK (ideally directly, instead of messing with the shift register which probably doesn't have an equivalent in the emulator output)
  • Enable #RAMP and wait on interrupt from Scale timer.
  • On interrupt, enable BLANK.


vectordrawtolength:

For drawing all connected vectors within maximum allowable drift time:

  • Set Z
  • Calculate angle of vector to be drawn.
  • Set Y to furthest possible point at that angle
  • Set X to furthest possible point at that angle
  • Calculate length of vector to be drawn in terms of drawing time (Scale)
  • Set Scale timer.
  • Disable BLANK
  • Enable #RAMP and wait on interrupt from Scale timer
  • On interrupt, enable BLANK
  • If finished drawing all connected vectors for the object: assert ZERO
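The numbers behind the two approaches can be sketched in C. This is an illustration only, not code from the test program: the names are hypothetical, the DAC is assumed to take signed 8-bit values with 127 as the largest positive one, and FULL_SCALE_TICKS (the Timer 1 count needed to draw a line whose longest component is 127 units) is a made-up calibration constant.

```c
#include <assert.h>
#include <stdlib.h>

#define DAC_MAX 127           /* assumed largest positive signed 8-bit DAC value */
#define FULL_SCALE_TICKS 255  /* hypothetical Timer 1 count for a full-length draw */

/* X/Y values and draw time for one vectordrawtolength-style draw. */
struct draw_setup {
    int x;           /* X value, scaled so the longest axis is at DAC_MAX */
    int y;           /* Y value, scaled by the same factor */
    int ramp_ticks;  /* Timer 1 count that stops the beam at the right length */
};

/* vectordrawtopoint budget: fixed settle time per line, so lines per frame. */
int lines_per_frame(unsigned settle_us, unsigned refresh_hz)
{
    assert(settle_us > 0 && refresh_hz > 0);
    unsigned frame_us = 1000000u / refresh_hz;
    return (int)(frame_us / settle_us);
}

/* vectordrawtolength: push X/Y to the limit at this angle, then time the draw. */
struct draw_setup draw_to_length_setup(int dx, int dy)
{
    struct draw_setup s = { 0, 0, 0 };
    int ax = abs(dx), ay = abs(dy);
    int longest = ax > ay ? ax : ay;
    if (longest == 0)
        return s;                       /* nothing to draw */
    s.x = dx * DAC_MAX / longest;       /* same angle, maximum magnitude */
    s.y = dy * DAC_MAX / longest;
    s.ramp_ticks = longest * FULL_SCALE_TICKS / DAC_MAX;
    return s;
}
```

lines_per_frame(300, 50) reproduces the figure of 66 lines quoted in the email. For a vector of (20, -10), draw_to_length_setup returns X = 127 and Y = -63, with a draw time proportional to the longest component.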



Read Test
This test sequentially reads all of the VIA registers and prints the result to the terminal on the Pi0.
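The general shape of such a test might look like the sketch below. It is not the actual read test source: the read function is passed in as a callback so that the example stays independent of the pitrexio-gpio API, and fake_via_read is a stand-in that returns made-up values instead of touching hardware. The register names follow the usual 6522 register map.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define VIA_REG_COUNT 16  /* the 6522 has 16 addressable registers */

/* Conventional 6522 register names, indexed by register address. */
static const char *const via_reg_names[VIA_REG_COUNT] = {
    "ORB/IRB", "ORA/IRA", "DDRB", "DDRA",
    "T1C-L", "T1C-H", "T1L-L", "T1L-H",
    "T2C-L", "T2C-H", "SR", "ACR",
    "PCR", "IFR", "IER", "ORA/IRA (no handshake)"
};

/* Hypothetical callback type: read one VIA register and return its value. */
typedef uint8_t (*via_read_fn)(uint8_t reg);

/* Read every register in order and print one line per register. */
int dump_via_registers(via_read_fn read_reg, FILE *out)
{
    assert(read_reg != NULL && out != NULL);
    int count = 0;
    for (uint8_t reg = 0; reg < VIA_REG_COUNT; reg++) {
        unsigned value = read_reg(reg);
        fprintf(out, "0x%X %-22s = 0x%02X\n",
                (unsigned)reg, via_reg_names[reg], value);
        count++;
    }
    return count;
}

/* Stand-in for a real read routine: returns a recognisable fake value. */
uint8_t fake_via_read(uint8_t reg)
{
    return (uint8_t)(0xA0 | (reg & 0x0F));
}
```

Calling dump_via_registers(fake_via_read, stdout) prints sixteen lines; on the Pi0 the callback would be replaced by a wrapper around the library's read routine.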

Read/Write Test
This test is for alternately reading and writing to the VIA. As not all values can be safely written to all of the VIA registers without putting the VIA into states that are invalid for the Vectrex circuitry, known safe values have to be written for testing. As an easy solution, the write test program was modified to include read operations and print the result to the Pi0 terminal. The write test is looped a few times before closing, and this turns out to be long enough for a brief flash of drawn vectors to be visible on the Vectrex screen.

Bouncer Test


Bouncer test program

Graham Toal successfully tested a program to draw the word "BOUNCER" on the Vectrex screen. However it suffers from some glitches related to timing issues caused by Linux.
Video

Vector Drawing Library

Chris Salomon has developed a general-purpose Vectrex_Interface library, which provides vector drawing functions, and also allows use of the Vectrex's controller inputs and audio. This may be used to accept vector plot data produced by video game emulators or custom Raspberry Pi software. It is designed to optimise vector output from emulators for the Vectrex hardware, by using relative positioning of connected vector plots wherever possible.


Tailgunner arcade game converted to run on the Raspberry Pi using Vectrex screen and controller


The vector drawing library was used in order to modify Graham Toal's port of the Tailgunner arcade game to use the Vectrex display and controller. A calibration system allows manual adjustments to be made in order to compensate for variation in the analogue stages of the Vectrex hardware.

Performance on the Raspberry Pi Zero while running Linux has been limited. However a real-time build of the game, which the Pi0 boots into directly, shows much better performance.
Tailgunner under Linux Video
Tailgunner Real-Time Video

Display glitches currently observed under Linux are due to delays in sequential vector drawing operations, and/or beam recalibration, caused by execution being unpredictably interrupted by Linux processes. These delays cause vectors to be drawn or positioned incorrectly due to drift of the beam position.

Vectrex Emulation

Work is progressing with modifying the open-source VecX Vectrex emulator software to redirect read and write operations directed to the emulated VIA, to the real VIA in a Vectrex console connected via the PiTrex cartridge. This should allow a performance test of the PiTrex interface, as well as an opportunity for equivalent functions to a traditional multicart to be implemented.


Vectrex boot screen with distorted text

Minestorm running via emulation on the Raspberry Pi

Chris Salomon successfully ran the VecX emulator with modifications to perform all VIA operations on the real 6522 VIA chip inside the Vectrex. Sound and controls worked well, and vectors displayed fairly well. The most noticeable problem is the display of raster text or graphics using the Vectrex BIOS routines, which is seriously distorted due to the Raspberry Pi not being able to time writes to the VIA shift register with the precision required. Improvements using the IRQ LATCH function of the PiTrex may be possible.
Minestorm Video


Successful raster display with Karl Quappe which uses alternative raster drawing routines

Testing with the game Karl Quappe, which uses its own raster drawing routines for displaying text, was largely successful. This indicates that raster display should be possible using other methods, though some display artifacts were still present.
Karl Quappe Video

Update:

Vectrex emulation using the "direct" method (passing VIA commands directly to the real chip in the Vectrex) has now been significantly improved and displays very similarly to ROMs running normally on the Vectrex, including raster text.

Arcade Emulation


Black Widow running via PiTrex

Tests with arcade emulations of games utilising vector displays, such as those supported by the MAME software, are in progress. The vector drawing library will be used to draw on the Vectrex screen using vector data from the emulator, with a drawing process optimised for the Vectrex hardware.

Initial working implementations for 6502-based arcade games, and Cinematronics games, have been demonstrated. Tailgunner and Asteroids are in near-complete state including the substitution of the original sound effects with sounds compatible with the Vectrex's Programmable Sound Generator.

See Arcade_Emulation.

Sound


The PiTrex allows the Pi0 to access both of the sound generation methods available in the Vectrex (no, that's not counting the buzz :) ). The AY3-8912 sound chip is conventionally used by Vectrex games, and the Vectrex emulator running on the Pi0 successfully uses this to produce sounds using the original write instructions captured from the emulation and redirected to the physical chip inside the Vectrex.

Custom made games can similarly use the Vectrex's sound generator chip. Also some vector arcade games used the same chip (though usually more than one) so a similar method to that used for the Vectrex emulator could be used.

MAME, and likely many (all?) other emulators, use emulations specific to each sound chip used in original arcade hardware in order to produce a digital sound stream. This makes adapting to use the AY3-8912 sound chip difficult because there is no common software synthesizer interface which could be adapted to produce equivalent instructions to the AY3-8912. The physical limitations of channels, filtering, and waveform generation, compared to other sound hardware used in arcade machines, would obviously also limit the accuracy of the output compared to the original arcade machine.

The alternative means of generating sound on the Vectrex is also available for use. This allows sampled sounds to be played via the Digital-to-Analogue Converter (DAC) that is part of the vector drawing circuitry, and its use is demonstrated by the voice samples used in the original game Spike. This however requires far more write operations than using the Vectrex sound chip, which may prevent sufficient vector drawing operations from being performed in the required time, or require an audio sampling frequency which is too slow to correctly reproduce the sounds. The DAC is also only 8bit, compared to modern 32bit PC sound cards. However note that samples used in chiptunes are commonly 8bit anyway, so that's unlikely to matter if only processing the output from emulated arcade sound generators.

Failing these approaches, a USB audio adapter and external speakers, or a bluetooth speaker device for the Raspberry Pi Zero W, may be used for sound output directly from the Pi0 using the standard sound interface of the emulator. This may be difficult for "bare metal" real-time implementations of the PiTrex software, due to the extra layers of software protocol support required.

The addition of a sound output from the PiTrex cartridge is also an option. This could involve adding an audio DAC to the hardware design, which could use the PCM audio output available from the GPIO header, or alternatively the PWM sound output method could be used.

Hybrid Approaches

Assuming that the number of write operations required in order to produce sampled sound via the Vectrex DAC at a reasonable bitrate would not allow sufficient time for drawing vector graphics, some form of hybrid use of samples and the AY3-8912 Programmable Sound Generator (PSG) may be worthy of consideration.

Waveform analysis could be performed by the Pi0 on the audio streams output from an emulator such as MAME, either during execution of the game (probably a bad idea), or separately for the results to be pre-computed for later integration with the emulator. If this can not provide sufficient information to generate equivalent instructions to the PSG directly (likely it will just provide the frequency and volume of a noise), the emulation of the PSG from MAME or Vectrex emulators could be used to generate all possible waveforms with matching frequency/volume, then the deviation between each, compared with the arcade game sample, could be calculated. The instructions that produce the most similar sample from the PSG emulator can then be saved, and the arcade emulator modified so that when it would normally send the instructions corresponding to that noise to the sound emulation, it uses the pre-computed equivalent writes to the Vectrex PSG.
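For the simplest case, where analysis only yields a frequency, the matching step could look like the sketch below. It assumes the AY3-8912 in the Vectrex is clocked at 1.5MHz and uses the tone relationship from the AY-3-8910 family datasheet (tone frequency = clock / (16 x 12-bit tone period)). The function names are hypothetical, and volume, noise and envelope matching are not covered.

```c
#include <assert.h>

#define PSG_CLOCK_HZ 1500000UL  /* assumption: PSG clocked at 1.5MHz in the Vectrex */
#define PSG_TP_MAX   4095u      /* tone period register is 12 bits wide */

/* Nearest 12-bit tone period for a wanted frequency in Hz. */
unsigned psg_tone_period(unsigned freq_hz)
{
    assert(freq_hz > 0);
    unsigned long div = 16UL * freq_hz;
    unsigned long tp = (PSG_CLOCK_HZ + div / 2) / div;  /* round to nearest */
    if (tp < 1)
        tp = 1;
    if (tp > PSG_TP_MAX)
        tp = PSG_TP_MAX;
    return (unsigned)tp;
}

/* Frequency actually produced by a tone period, in millihertz (rounded down). */
unsigned long psg_tone_millihertz(unsigned tp)
{
    assert(tp >= 1 && tp <= PSG_TP_MAX);
    return PSG_CLOCK_HZ * 1000UL / (16UL * tp);
}
```

For a 440Hz target this gives a tone period of 213, which plays at roughly 440.14Hz. The difference between the wanted and produced frequency is one simple measure of the deviation mentioned above.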

That would probably work better if a real AY3-8912 chip was used to record the samples, preferably not from a Vectrex but using a custom chiptune player device with an audio output line, such as one of the various DIY projects online.

Obviously the number of simultaneous sound channels is a hard limit even when it comes to approximations of more advanced sound chips using the Vectrex PSG. The DAC can be used in combination with the PSG to add pre-mixed sampled sound from the arcade emulator, such as only to be used when the PSG has run out of channels. Probably still not practical without disturbing vector drawing too much though.

Another crazy idea is to use the DAC with a limited number of writes synchronised perfectly with the output from the PSG, so that the PSG waveform is "moulded" into something more like the output of the emulator's sound emulation. This would only use DAC writes when they were specifically required, and therefore allow more time for vector drawing operations. However the complexity of calculating where the "moulding" of the waveform is required, synchronising perfectly with the PSG output, and allowing for efficient operation within the vector drawing code, probably makes this highly impractical.

User Interface

A user interface using the Vectrex screen and controller is to be developed.

Raspberry Pi Environment

The following instructions may be added to the end of the config.txt file in the first directory of the SD card that the Raspberry Pi Zero is booted from:

 [ALL]
# Configure GPIO for PiTrex

#Inputs
gpio=0-5,16-24,26-29=ip
#Outputs
gpio=6-13,25=op
#No pull-up/down on RDY
gpio=24=np



This configures the GPIO pins for use with the PiTrex before the interface is initialised using the vectrexinit function in pitrexio-gpio. This is recommended to reduce the risk of any invalid I/O states being present during start-up. RPi documentation on GPIO config can be found at https://www.raspberrypi.org/documentation/configuration/config-txt/gpio.md and https://www.raspberrypi.org/documentation/configuration/config-txt/conditional.md

The kernel booted into by the Pi can be determined by the state of any GPIO pin, using an entry such as this in config.txt:

 [GPIO2=1]
kernel=raspbian.img
[ALL]
[GPIO2=0]
kernel=tailgunner.img

Operating System

Currently Raspbian is the primary target operating system to be used on the Pi0. This offers advantages including easy initial configuration, wide package support including emulators and their dependencies, and extensive documentation.

PiCore Linux (based on Tiny Core Linux) has been proposed as an alternative target operating system by Kevin Koster. It offers advantages including lower system resource usage, faster start-up, and smaller download and SD card size/write-time requirements. It also resets the system to a configured initial state upon each boot, so it is easy to ensure a uniform and consistent software environment, which can be controlled through the package management system.

The software is expected to be easily ported to other Linux operating systems for the Raspberry Pi Zero.

As noted elsewhere in the software description, a Real-Time OS offers numerous performance advantages at the cost of increased difficulty in development of the PiTrex software. Many such environments, and implementations of video game emulators, exist. Related links may be found on the Useful_Links page.

Raspberry baremetal -> PiTrex

Raspberry boot process (general)

  • GPU loads (firmware loader) bootcode.bin - executes (from first FAT partition)
  • GPU loads start.elf – executes
  • GPU loads config.txt and cmdline.txt
  • GPU generates device tree (if not switched off by config.txt)
  • GPU sets the „graphic“ system…
  • GPU loads kernel.img (or other file denoted by config.txt) „normally“ to 0x8000 – but other locations can be specified by config.txt
  • GPU writes simple „init“ code to 0x0000 which ends with a branch to the loaded kernel
  • GPU releases ARM from reset
  • ARM starts from a reset at location 0x0000 (reset vector)
  • When ARM „reaches“ the loaded kernel address, registers r0-r2 are set with: r0 = #0, r1 = machine type, r2 = pointer to device tree blob

PiTrex baremetal boot

(non NOOBS!)
Due to the NOOBS partition layout, NOOBS does not load kernel.img from the first FAT partition. At the moment the filesystem code in piTrex baremetal can only access the first partition, which garbles things up a bit. Possibly the mechanisms below also work with NOOBS, but they would be vastly redundant, since NOOBS is a bootloader menu system in itself. So what would you do? Choose Raspbian and then have a second bootloader, the piTrex loader, loaded? One could probably configure NOOBS directly to boot different piTrex kernels...

  • The piTrex „kernel“ is named as default: „kernel.img“ and loaded to 0x8000 (from start.elf)
  • baremetalEntry.S:
  • Registers r0, r1, r2 are saved to a „save“ location (memory: 0x80, 0x84, 0x88) in case we need them for a „linux“ boot. On my Pi, r2 points to a far-away region (0x1BFE9C00) and does not collide with anything I ever do. If this changes in the future, I will have to relocate the device tree blob to a safe location (which will be: 0x3e00000)!
  • Vectors (reset/IRQ/exceptions ...) are set up, 8 vectors starting from address 0x0000, followed by an indirect jump table, again 8 data entries (all uint32_t –> 4 bytes) thus occupying the first (8+8)*4 = 64 bytes (0x40)
  • Stack pointers for the different modes are set up in a total memory region: 0x3f08000 – 0x3f48000
  • VFP, MMU (page table at 0x4000), Data Cache, Instruction Cache, unaligned memory access, and cycle counter access are switched on
  • Control is given to „C“ -> kernelMain()
  • kernelMain():
  • Clock speed of ARM is set to 1000MHz / UART is initialized
  • Control is given to -> main()
  • SD card is initialized and the first FAT partition mounted
  • „loader.pit“ is loaded to memory address: 0x4000000 and started
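The vector area described in the steps above can be written down as a C structure to check the size arithmetic. This is a sketch for illustration and not taken from baremetalEntry.S: the structure and function names are hypothetical, and only the counts and addresses come from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS     8      /* reset, IRQ and the other exception vectors */
#define SAVED_REGS_ADDR 0x80u  /* r0 saved at 0x80, r1 at 0x84, r2 at 0x88 */

/* Layout of the first bytes of memory as described above. */
struct vector_area {
    uint32_t vectors[NUM_VECTORS];     /* 0x00..0x1F: one entry per vector */
    uint32_t jump_table[NUM_VECTORS];  /* 0x20..0x3F: indirect jump targets */
};

/* Total size of the vector area in bytes. */
size_t vector_area_size(void)
{
    return sizeof(struct vector_area);
}

/* Byte offset of the indirect jump table from address 0x0000. */
size_t jump_table_offset(void)
{
    return offsetof(struct vector_area, jump_table);
}

/* Non-zero when the saved r0-r2 values sit above the vector area. */
int saved_regs_clear_of_vectors(void)
{
    return SAVED_REGS_ADDR >= vector_area_size();
}
```

The size comes out as 0x40 bytes, matching the (8+8)*4 figure, with the jump table starting at 0x20 and the saved registers at 0x80 safely above both.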


Changes to lib-rpidmx512

  • page table is relocated to 0x4000
  • MEM_COHERENT_REGION is relocated to 0x3f00000

PiTrex baremetal loader

  • control is immediately given to „C“: loaderMain()
  • again the first found FAT partition is mounted
  • vectrexinit(1): is called to initialize the piTrex HW
  • v_init(): is called to initialize the Vectrex „API“
  • Menu items are displayed – each program is another file on the SD, that is „kernel“-compatible (i.e. loadable to location 0x8000 and self-sufficient, each file could be renamed to „kernel.img“ and be started as a standalone baremetal kernel)
  • upon selection the „program“ is loaded to 0x8000 and registers r0, r1, r2 are filled with the original values.
  • then control is given to the loaded program with a jump to address 0x8000
  • in anticipation that no piTrex program is larger than about 61MB – loader.pit stays in memory at location 0x4000000. If at some stage larger programs occur – the loader can be loaded to a higher address!
  • When a vectrex-reset is detected, the program jumps to 0x4000000 again and thus executes the loader "from start".
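The addresses mentioned in this section can be collected into a small memory map to see where the size limit for a program comes from. This is a sketch: the macro and function names are hypothetical, and treating the fallback device tree location at 0x3e00000 as the lowest reserved address is only one possible reading of the "about 61MB" figure.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses taken from the boot description on this page. */
#define PROGRAM_LOAD_ADDR   0x0008000u  /* programs and kernel.img load here */
#define DEVICE_TREE_SAFE    0x3e00000u  /* fallback device tree location */
#define MEM_COHERENT_REGION 0x3f00000u  /* relocated lib-rpidmx512 region */
#define STACKS_START        0x3f08000u  /* stacks: 0x3f08000 - 0x3f48000 */
#define STACKS_END          0x3f48000u
#define LOADER_ADDR         0x4000000u  /* loader.pit stays resident here */

/* Bytes available to a program loaded at 0x8000 below a reserved address. */
uint32_t max_program_bytes(uint32_t lowest_reserved)
{
    assert(lowest_reserved > PROGRAM_LOAD_ADDR);
    return lowest_reserved - PROGRAM_LOAD_ADDR;
}

/* Same limit expressed in whole MiB (rounded down). */
uint32_t max_program_mib(uint32_t lowest_reserved)
{
    return max_program_bytes(lowest_reserved) / (1024u * 1024u);
}
```

max_program_mib(DEVICE_TREE_SAFE) gives 61, while measuring up to LOADER_ADDR instead would give 63, so under this reading the reserved regions below the loader set the practical limit rather than the loader address itself.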


PiTrex baremetal "program"

  • again: each piTrex „program“ is „kernel“-compatible! (i.e. loadable to location 0x8000 and self-sufficient, each file could be renamed to „kernel.img“ and be started as a standalone baremetal kernel)
  • the „startup“ process is the same as the baremetal boot (in fact these are the same files!) up to where function main() is called.
  • function main() (as it should be) is the entry point of each „piTrex“ program (linked)
  • Usually what main() would do first:
  • mount the first found FAT partition
  • vectrexinit(1): is called to initialize the piTrex HW
  • v_init(): is called to initialize the Vectrex „API“
  • ...then the programs are on their own...