Overlay.doc                 A tiny introduction to Overlays
----------------------------------------------------------------------------

            © 1997,98 THOR-Software. Read the licence at the end!

$VER: Overlay Doc 1.5 (5.7.98)
____________________________________________________________________________

Abstract: This doc presents some features of overlay binary files of
AmigaDOS, their hunk structure and a standard overlay manager.

In this doc, tradenames and trademarks of some companies and products have
been used, and no such uses are intended to convey endorsement of or other
affiliations with the documentation.
____________________________________________________________________________

Thanks to Jörg Riemer for the fruitful discussion about the overlay file
structure and for finding some mistakes in this doc.

Thanks to Harry Sintonen for correcting another mistake in the doc about
the magic word.
____________________________________________________________________________

i) What are "overlays"?

Overlay binaries are AmigaDOS executables that are, unlike the usual
executables, only partially loaded from disk. Thus overlayed programs will
usually occupy less memory and will load faster, because only the needed
parts of the executable reside in memory; everything else is kept on disk
and loaded only if needed.

This mechanism is provided by the LoadSeg() and UnLoadSeg() procedures of
the operating system (the dos.library), a small startup program that is
responsible for locating the correct module to reload (the "Overlay
Manager"), plus the linker, which builds the necessary references, i.e.
creates the information that tells the overlay manager which part of the
code can be found where.

This overlay mechanism is NOT unique to the Amiga and has been used on
other computers as well (on the antique Atari XL, for example, but also on
the Macintosh and elsewhere). However, it is not very well known and not
very well documented for the Amiga.
See the literature [1,2] for more references.

The standard overlay binary consists, like every other executable file, of
a SINGLE output file that has been generated by the linker. However,
internally, this file can be thought of as having a tree-like structure,
consisting of several "nodes" (not to be confused with hunks: a node may be
built of several hunks, not just one. More on that topic below).

To give an example, consider the following simple tree:

        overlay manager   \
               |           |__ this is considered to be "one" node,
               |           |   the root node
          common part     /
             /   \
            /     \
        node a   node b        <-- "overlayed" nodes

The overlay manager and the "common part" are loaded immediately by the DOS
and are called the "root node". The overlay manager is used for loading the
overlayed nodes, as described below, and is usually provided by the linker,
i.e. you don't have to write it. The "common part" of your program consists
of symbols (procedures, data structures, whatever) needed by all parts of
the executable, and is, like the overlay manager, always loaded and not
removed from memory unless the program quits. Thus the "common part" will
contain the main() procedure of your program, which the overlay manager
calls first. Both together are considered to be ONE node, even though the
code comes from completely different sources and serves completely
different purposes. Thus, a non-overlayed program consists - for "overlay
purposes" - only of the root node.

HOWEVER, the data in nodes a and b in this example WON'T be loaded by the
DOS, and won't waste memory. They will, instead, only be loaded if needed.
Consider, for example, that your main program calls a procedure ATest() in
node a. Then the overlay manager will notice AT EXECUTION TIME that this
procedure is not yet in memory, and will load node a from the disk,
relocate the code and execute it after loading it.
This loading is totally transparent to your code, no additional work is
needed (this is prepared by the linker, the loading is done by the overlay
manager).

If you now call a procedure BTest() from node b, node a is unloaded, the
memory occupied by it is freed and node b is loaded into memory instead. As
you can imagine, this will save some memory, esp. for BIG programs.
However, because of this, you won't be able to call a procedure residing in
node b from inside of node a, since the calling procedure would have to be
unloaded. This is usually checked by the linker.

The nodes a and b are called the "children" of the node above, in this case
the "common part". The latter is called the "parent" of a and b, and in
this case also the "root" of the tree, because it is the topmost node apart
from the overlay manager.

And now a more complex example that is taken from real life: This is the
overlay structure of the SetVNC program (which is part of the ViNCEd CON:
replacement package and available at my home page, see below). It has a
somewhat bigger tree:

                overlay manager           \
                       |                   |-- again, one node.
     startup code & argument parsing      /
          / |       |        |         |
         /  |       |        |         |
        /   |       |        |         |
       /    |       |        |         |
 mounter job control help argument  prefs loader
                          line      and saver
                          printer     /  \
                                     /    \
                                    /      \
                                   /        \
                        shell direct      editor gui
                        commands              |
                                              |
                                          online help

In this case, the root node is the "startup code & argument parsing" node,
which does the command line argument parsing. Depending on the argument
line, exactly one of the five children of this node can reside in memory at
a time. However, one node (the prefs loader) has children of its own that
it might load on request.

The following rules of loading and unloading apply:

If a node must be loaded, all other children of the parent of this node
plus all their children will be unloaded. The children of the new node
WON'T be loaded. So if you request the "shell direct" commands, the "editor
gui" and the "online help" are unloaded.
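The loading rule just stated can be tried out in a small simulation. The following is only a sketch and not code from any overlay manager: the tree is kept as an array of parent indices, and all names (ovl_tree, load_node, setvnc_tree, the node numbers) are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16
#define NO_PARENT (-1)

/* Hypothetical node numbers for the SetVNC tree shown above. */
enum { ROOT, MOUNTER, JOB_CONTROL, HELP, ARG_PRINTER, PREFS,
       SHELL_CMDS, EDITOR_GUI, ONLINE_HELP, NODE_COUNT };

struct ovl_tree {
    int  count;                 /* number of nodes in use           */
    int  parent[MAX_NODES];     /* parent index, NO_PARENT for root */
    bool loaded[MAX_NODES];     /* true while the node is in memory */
};

/* Build the SetVNC tree; only the root node is loaded initially. */
struct ovl_tree setvnc_tree(void)
{
    struct ovl_tree t = { .count = NODE_COUNT };

    t.parent[ROOT] = NO_PARENT;
    for (int n = MOUNTER; n <= PREFS; n++)
        t.parent[n] = ROOT;
    t.parent[SHELL_CMDS]  = PREFS;
    t.parent[EDITOR_GUI]  = PREFS;
    t.parent[ONLINE_HELP] = EDITOR_GUI;
    t.loaded[ROOT] = true;
    return t;
}

/* Unload a node together with everything below it. */
static void unload_subtree(struct ovl_tree *t, int node)
{
    t->loaded[node] = false;
    for (int i = 0; i < t->count; i++)
        if (t->parent[i] == node)
            unload_subtree(t, i);
}

/* Load one node: all other children of its parent, plus their
 * children, are unloaded. The children of the new node are NOT loaded. */
void load_node(struct ovl_tree *t, int node)
{
    assert(node >= 0 && node < t->count);
    int parent = t->parent[node];

    /* A node can only be requested from its (loaded) parent. */
    assert(parent == NO_PARENT || t->loaded[parent]);

    for (int i = 0; i < t->count; i++)
        if (i != node && t->parent[i] == parent)
            unload_subtree(t, i);
    t->loaded[node] = true;
}
```

Loading PREFS, EDITOR_GUI and ONLINE_HELP in this order and then requesting SHELL_CMDS leaves only the root, PREFS and SHELL_CMDS marked as loaded, which is the situation described in the text.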
If, after that, you call the "job control" node, the "prefs loader" and the
"shell direct" commands are freed and replaced by the job control module.

Because of these rules, the following restrictions on the invocation of
nodes apply:

You may call:

a) each procedure of the parent node

b) procedures in the parent of your parent and so on, up to the "common
   part" or "root"

a) and b) say that you may call everything "on top" of your node. This is
because all these nodes must have been loaded at the time your node was
invoked, so all their procedures are present. The invocation types a) and
b) are "ordinary" function calls in the sense that the overlay manager is
not needed.

c) Call functions of one of your children.

This is a special "overlay" type call, and the overlay manager is needed
here. It will find out if the appropriate child node is already in memory,
and, if not, will replace or reload it from disk.

You may NOT call:

a) Functions in nodes that are not parent nodes of your node, i.e. not on
   top of your node, except direct children.

It is even forbidden, using the standard overlay manager, to call these
non-parent functions in an indirect way: the calling node will be unloaded
when the parent node invokes the overlay manager to swap in a different
child. Thus, the returning parent function will return into nirvana, i.e.
into the unloaded node.

b) Children of your children. Hence, only DIRECT dependencies are traced by
   the standard overlay manager.

The linker should be able, however, to detect this problem.

And what about data references?

You may refer to memory or data objects of

a) and b) your parent or parents of your parent, up to the root level of
   the tree.

This is again no problem, since the overlay manager is not needed to do
this work.

EXCEPT a) and b) NO OTHER REFERENCES ARE ALLOWED. YOU MAY NOT refer to DATA
in your children! This is again a limitation of the overlay manager, which
assumes that all references to child nodes are code references.
However, it's easy to overcome this limitation if you write a stub
procedure in your child node that returns a pointer to the data object.
Then call this function to get the pointer, thus invoking the overlay
manager automatically:

        int main(void)
        {
        my_struct *a;

                a=getptr();     /* goes through the overlay manager */
                ....
        }

and in the overlayed node:

        my_struct *getptr(void)
        {
                return &my_data;
        }

Some care must be taken, however. Since the data resides in an overlay
module, THIS DATA WILL VANISH if you load another child of your node.
Nobody will warn you about this!
_____________________________________________________________________________

ii) Custom overlays

As I said, the loading of overlayed nodes is done by the overlay manager,
which, by itself, uses the LoadSeg() function of the DOS for the tough
part. However, you MAY write your own overlay manager with a custom load
procedure. Since the data parts behind the root nodes of the overlay tree
are never seen by the OS, YOU ARE COMPLETELY FREE in the design of your
overlay mechanism. However, for this case, you need a custom linker that
builds your overlay file and builds the custom structure needed by the
overlay loader. You won't get any support by the OS for loading the overlay
structures from memory, so all of this must be done by hand.

This method has its powers, and has also been done. For example, some
limitations of the tree-like structure above have been broken by the
overlay manager of DPaint II (Electronic Arts). This program uses a private
overlay manager, but the standard file structure to encode the overlay
tree. The EOA overlay manager can hold more than one node of one overlay
level, and load all nodes simultaneously into memory. This prevents
unnecessary disk I/O if enough memory is available and the overlay
mechanism is not required. The Mac overlay manager provides, by the way, a
similar mechanism of keeping nodes unless the OS runs out of memory.
The default overlay manager is rather clumsy compared to the highly
sophisticated EOA and Mac OS managers, although EOA stopped supporting it
with DPaint V.

Another example is the low-level debugger COP (also by the author of this
article) with a much simpler overlay type structure. The overlay part is in
this case the resident part of the debugger, and is only loaded if the
debugger is not yet in memory.

As you can imagine, nice tricks can be done with custom overlays, e.g.
games displaying a title while loading, or games that keep pictures in the
overlay nodes and so on. With an Amiga, it is absolutely not necessary to
put a dozen files on your HD: All data needed can be put in the overlay
nodes of an overlayed file and can be loaded into memory on request.

However, custom overlays are always a big bunch of work since you have to
develop everything by yourself: The linker, if you need a different file
structure, and a private overlay manager, possibly a private segment
loader, i.e. a LoadSeg() replacement.

Because of these custom structures, overlays can't be crunched safely - the
custom structure might need some changes if the length of the file and the
position of the nodes change. And this custom structure can't be known by
the cruncher.

The good part is: This goes for viruses, too. So you may say that an
overlayed program is very resistant against viruses. Either it works, or it
is infected and does not work at all.
_____________________________________________________________________________

ii) How to construct an overlay.

As I said above, overlays are built by the linker. Not every linker will
support them, however; for example, the public domain linker BLink of the
Software Distillery won't.

I won't go into detail here since this is described better in your linker
manual, or in the literature [1].

To make it short, a WITH file is needed for the standard ALink (and also
for other custom linker tools).
It should not only contain the options, as usual, but also special keywords
to start overlay generation. Here is what it looks like for the SetVNC
example:

ROOT    SVC_Startup.o
NEWOCV
OVLYMGR SVC_OvrlyMngr.o
OVERLAY
SVC_Mount.o
SVC_Prefs.o
*SVC_ShellCMDs.o
*SVC_Editor.o
**SVC_EditHelp.o
SVC_JobCtrl.o
SVC_Help.o
SVC_CLIHelp.o
#

The ROOT keyword specifies the root node of the overlay tree. This is the
"common part" which is initially loaded.

The NEWOCV keyword specifies the type of the overlay manager to be used.
There are actually two standards: The old one for stack-based calls of C
style functions, and the new one which uses registerized parameters and
should be preferred. More about this topic later.

The OVLYMGR option selects the overlay manager to be linked in front of
your code. Some linkers don't support this option and use an overlay
manager hard-wired into the linker; other linkers supply a default overlay
manager if this keyword is omitted. The overlay manager itself is a
standard object module (not an executable!), see below for an example.

The OVERLAY keyword starts the actual description of the overlay tree. It
is terminated by the hash mark in the last row. Each line under the OVERLAY
statement is the name of one overlay node, plus some stars in front of the
name that give the "level" of the node in the overlay tree. No stars means
a child of the "root" node, one-starred names are children of the
non-starred node above them, and two-starred names are children of the
one-starred name above them. Thus, these lines will produce exactly the
overlay tree example of SetVNC in the first chapter.
_____________________________________________________________________________

iii) How overlay files look on disk.

First difference: The HUNK_HEADER hunk of the binary contains the number of
hunks, plus the first and the last hunk that must be loaded. Since hunks
are counted from 0 up, the last hunk of a usual program is just the number
of hunks of the program minus one.
THIS IS NO LONGER TRUE FOR OVERLAYS! The third longword of the HUNK_HEADER
is the maximal number of hunks that must be kept in memory at a time. The
fifth longword now contains the number of the last hunk of the root nodes,
i.e. the last hunk that is initially loaded - again counting from zero up.

For short:

HUNK_HEADER     $000003f3
private_dummy   0               (must be here. Reserved for resident
                                library uses, and also documented for
                                this reason. However, this feature was
                                removed in the V36 DOS. Keep it zero!)
TABLE_SIZE      n               The maximal number of hunks that may
                                reside in memory simultaneously.
FIRST_HUNK      0               The first hunk to load. This MUST be the
                                zeroth hunk, since this is the root node.
LAST_HUNK       m               The last hunk to load, thus the last hunk
                                of the root nodes, counting the hunks of
                                the overlay manager as well.
SIZES[m+1]                      LAST_HUNK+1 longwords specifying the sizes
                                of the hunks in the root. The first number
                                is for hunk zero (thus the size of the
                                overlay manager), the next is the size of
                                the first hunk of your code node and so
                                on. There are m+1 numbers (plus some
                                special bits that can be set there to
                                request CHIP or FAST memory. Bit 30 is
                                CHIP, Bit 31 is FAST. If both bits are
                                set, the next longword contains the
                                memory type that is passed to AllocMem).

So far for the HUNK_HEADER.

The program starts with a HUNK_CODE. This code hunk is NOT the first hunk
of your code, but the overlay manager. Your code comes in the next hunk
(being the second, or hunk one in this numbering). Since the DOS always
starts executables at hunk zero, the overlay manager is CALLED FIRST and
will start your code NEXT.

Virus-Checker authors: BE WARNED. This call ISN'T done through a relocation
entry. Instead, the BCPL hunk linkage provided by the DOS is used, and the
root node is started at its first byte. NO RELOCATION OF THE OVERLAY
MANAGER RELATIVE TO HUNK #1 WILL HAPPEN. Dirty, but such is life.

The overlay manager code is special, again.
It is at first sight just a code segment like every other code segment, but
some data in it is read and prepared by the OS.

First and foremost, the second longword in this hunk MUST be the "Magic
LongWord" $0000ABCD. It is actually ignored by the segment loader (i.e.
LoadSeg()), but is needed by UnLoadSeg() to indicate that this hunk belongs
to an overlayed file and that some more resources must be deallocated.

The third to the sixth longword are usually defined as zero, but will be
filled by the DOS with vital information for the overlay manager as soon as
the root node has been loaded. This data, including the file handle and a
pointer to the overlay table, is simply deposited there and will overwrite
everything else. Hence, you must provide these four zero longwords to
reserve the room.

The first longword of the overlay manager is usually a jump around all this
magic, thus something like $6000xxxx. The xxxx is the jump distance and may
vary from overlay manager to overlay manager. Please do not depend on this
size, nor on the jump at all. (But I wonder what should happen there
instead... Overlayed devices and libraries are not supported, and how could
they?)

Everything behind the Magic Word and the four longs is free, in principle.
Free in the sense that it is not parsed by the DOS at any time. However,
there are some rules for finding out whether a custom overlay manager is
used or not - for example to be used by debuggers and other third party
programs. The standard overlay managers keep here:

1) Another Magic Longword $00005BA0 ("A thing CMB gave to me and I don't
   know what it is.")

2) A seven character BCPL-like string, saying "Overlay", thus
   $074f7665726c6179

The EOA overlay manager of DPaint I to IV doesn't identify itself as
"standard" since these identification bytes are missing.

Here, another standard ends. This information is, for example, used by COP
(my debugger). If everything is present like this, a standard overlay is
assumed.
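A debugger or file analyser could apply these identification rules roughly as follows. This is only a sketch under assumptions, not the test COP actually performs: it expects "body" to point to the first byte of the code of hunk zero (the jump instruction), and the function and constant names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Result codes; the names are made up for this sketch. */
enum ovl_kind { OVL_NONE, OVL_CUSTOM, OVL_STANDARD };

/* Longword positions inside hunk zero, counted from the jump instruction. */
enum {
    POS_MAGIC   = 1,   /* $0000ABCD, needed by UnLoadSeg()             */
    POS_LIBWORD = 6,   /* $00005BA0, follows the four DOS-filled longs */
    POS_NAME    = 7    /* BCPL string "Overlay", two longwords         */
};

/* Hunk files are big-endian; read one longword independent of the host. */
static uint32_t get_long(const unsigned char *body, size_t pos)
{
    const unsigned char *p = body + 4 * pos;

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Classify the contents of hunk zero; nbytes is the number of bytes
 * available at body. */
enum ovl_kind classify_overlay_manager(const unsigned char *body, size_t nbytes)
{
    /* Length byte 7 followed by the seven characters. */
    static const unsigned char name[8] =
        { 7, 'O', 'v', 'e', 'r', 'l', 'a', 'y' };

    assert(body != NULL);

    /* Without the magic word this is no overlay manager hunk at all. */
    if (nbytes < (size_t)4 * (POS_MAGIC + 1) ||
        get_long(body, POS_MAGIC) != 0x0000ABCDu)
        return OVL_NONE;

    /* Magic word present, but the "standard" identification is missing. */
    if (nbytes < (size_t)4 * (POS_NAME + 2))
        return OVL_CUSTOM;
    if (get_long(body, POS_LIBWORD) != 0x00005BA0u)
        return OVL_CUSTOM;
    if (memcmp(body + 4 * POS_NAME, name, sizeof name) != 0)
        return OVL_CUSTOM;

    return OVL_STANDARD;
}
```

The jump instruction in the first longword is deliberately not checked, since the text asks not to depend on it.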
For my overlay manager (the code is presented below), the story goes on:

3) two empty longwords. They keep library pointers.

4) A C style string: "THOR Overlay Manager 1.1"
   The 1.1 is the version number, and may increase in future.

Again, these standards in short (counted in longs):

HUNK_CODE       $000003e9
length          n               size of the hunk in longwords.
jump-instr.     $6000xxxx       where xxxx may vary.
MAGIC WORD      $0000abcd       A magic cookie. (Must be here and nowhere
                                else)
FH              $00000000       Becomes the BPTR file handle.
OVTAB           $00000000       Filled with the overlay table ptr by the
                                DOS.
HT              $00000000       Filled with hunk table ptr.
GV              $00000000       Filled with BCPL GlobVec. (More below...)
LIBWORD         $00005ba0       "Magic" lib word (no idea what for)
STRING          $074f7665
                $726c6179       The BCPL string "Overlay"
SysBase         $00000000       Filled with ExecBase by THOR Overlay
                                manager.
DOSBase         $00000000       Filled with base of dos.library by THOR
                                Overlay.
Version                         The "C-style" string:
                                "THOR Overlay Manager 1.1",0

Beyond that: The code starts.

So much for the overlay manager. Behind the manager, the root hunks of the
code follow. They are usual hunks, with all the relocation table stuff like
in every other executable. The root nodes terminate with a standard
HUNK_END $000003f2.

Behind this HUNK_END, a HUNK_OVERLAY $000003f5 is required by the DOS to
identify this file as overlayed. The data in this hunk will be loaded, but
not parsed, by the LoadSeg procedure of the DOS. Decoding this data is,
however, done by the overlay manager.

The structure of the HUNK_OVERLAY is free - in principle. The only
requirement is that the first longword of the hunk is the size of the hunk
body in longwords minus one. Since the standard memory allocator of the
dos.library is used, one additional longword gets allocated to hold the
size in bytes ("AllocVec style"), but this longword is not considered to be
part of the overlay data structure; hence, the overlay manager will always
receive a pointer to the next longword, as for all "AllocVec'd" memory.
(Strictly speaking, the current implementation of LoadSeg() also cares
about the memory attribute bits in this longword, so you may put the
overlay data into CHIP MEM. I don't see any need for this and declare this
as "software junk". Please don't make any use of this feature...)

To conclude, if this longword is n, then 4+n*4 bytes are allocated to hold
the hunk, the first longword is filled with the byte size, and n*4 bytes
are loaded from disk starting with the next longword. Its address is the
beginning of the overlay data structure, and this address will be filled
in.

The hunk body contains the information needed by the overlay manager, e.g.
where the overlayed nodes can be found, and which entries must be
relocated. It's only parsed by the overlay manager itself; the DOS doesn't
care about the contents of this hunk.

The DOS segment loader stops parsing the file after encountering a
HUNK_OVERLAY, so the part of the file behind this hunk is again free as it
is never seen by the DOS, unless you invoke the DOS LoadSeg function again
to load data from this part of the file. A standard overlay file will
continue with the HUNK_HEADER of the first overlayed node in the overlay
tree.

Now, let's assume the file uses a standard overlay structure.

The HUNK_OVERLAY will now contain two data tables, the ordinate table and
the overlay table, laid out as follows:

- the offset of the overlay table to the beginning of the hunk, given in
  longwords,

- the ordinate table, which will be filled in by the overlay manager and
  is not stored on disk,

- a dummy zero longword that makes life easier for the overlay manager,

- the overlay table.

The ordinate table keeps track of which nodes of the overlay tree are
actually loaded in memory. Thus, one entry for each "level" of the overlay
tree is reserved here (more on that below).

The overlay table contains information about the overlay references, i.e.
about references to overlayed segments that must be resolved by the overlay
manager.
If we call the length "h" of the longest path in the overlay tree from the
root down to the bottom, including the root node, the "height" of the tree,
then

- the first longword of a standard overlay hunk will be h+1,
- h-1 zero longwords follow, making up an empty ordinate table; no entry
  is reserved for the root node because it is loaded all the time anyway,
- one zero termination longword comes next,
- and the remaining part is the overlay table.

For the "SetVNC" example above, the tree height will be four (the overlay
manager and the "common part" containing the startup code count together as
one single "root" node), thus,

- the first long is five,
- four zero longwords follow, the last is the terminator,
- the overlay table follows.

Again, for short the structure:

HUNK_OVERLAY    $000003f5
OVLR_SIZE       n       number of longwords that follow, minus one.
                        REQUIRED! (But not kept in memory)
TREE_SIZE       h+1     size of the tree, i.e. maximal "depth", counting
                        the root node.
TREE_PTRs [h]           h longwords, containing zero (the last one is
                        the terminator).
OVLR_DATA               8 long sized structures.

The remaining part of the overlay table consists of structures of eight
longwords each, keeping the references to overlayed functions, one for each
reference to an overlay, i.e. for each reference that must be resolved at
execution time by the manager by loading data from disk.

This structure looks like the following:

FILE_POSITION:
        The zeroth longword contains the file offset of the overlay
        module the symbol resides in. This is really the plain offset of
        the HUNK_HEADER of this module from the start of the file.

        WARNING: These longs are the reason why overlays can't be crunched
        without further care! If you change the size of the overlay
        manager or the size of the root nodes, these longs MUST be
        adjusted. However, since custom overlays are possible, KEEP HANDS
        OFF!

RESERVED(2):
        The next two longs in this structure are reserved, for whatever.
        Keep them zero.

OVERLAY_LEVEL:
        Next is the OVERLAY_LEVEL.
        This is the depth of the overlay module the symbol is contained
        in, counting from zero for the root downwards in the tree. Thus,
        for the SetVNC example, a reference to the job control module will
        have a one here, since it is a direct child of the root.

ORDINATE:
        We continue with the OVERLAY ORDINATE, which is the "horizontal"
        position of an overlay node within the tree. The overlay ordinates
        are set up by the linker and are unique in each horizontal layer
        = level in the overlay tree. They usually count from one, but
        that's not required.

INITIAL_HUNK:
        The next long is the initial hunk number of this overlay node and
        describes where in the hunk table (see below) the hunk pointers
        should go. If, for example, the root nodes of the tree plus the
        overlay manager take up four hunks, counting 0 to 3, and this is a
        direct child of the root, the initial hunk for loading will be
        four. If this module has three hunks, the initial hunk of each of
        the children of this node will be seven, and so on. Strictly
        speaking, this longword is actually not used for loading a node,
        but for removing it from the hunk table.

SYMBOL_HUNK:
        We come to the next long, the symbol hunk. This is the actual hunk
        number this overlay reference is relative to. If you call a
        subroutine of an overlay node, a reference for this procedure will
        be generated here, and this longword will contain the number of
        the hunk in which the symbol is kept in the overlay node. In the
        example above, if the symbol is in the first hunk of the overlay,
        this longword will again be four, since the first overlayed hunk
        in the first module is number four in the total file. If you refer
        to a symbol in the second hunk of an overlay module, this number
        will be the initial load hunk plus one, and so on.

SYMBOL_OFFSET:
        The last longword is just the offset of the symbol within the hunk
        specified above, not counting the initial BPTR linkage of each
        segment.
        This value plus four will be added to the hunk base address to get
        the pointer to the routine you would like to call.

Here again the overview:

FILE_POSITION           Offset in the executable of the HUNK_HEADER of
                        the overlay module.
RESERVED (2)            Two empty longs
OVERLAY_LEVEL           The level within the overlay tree, zero is the
                        root.
ORDINATE                The horizontal position in a level of the tree.
                        Only used as a unique ID
INITIAL_HUNK            First hunk number of the overlayed module.
SYMBOL_HUNK             The symbol hunk this reference is pointing to.
SYMBOL_OFFSET           The offset of the symbol in the hunk, counting
                        the first real longword of code as offset zero.

The next data in an overlay file are again executable modules like the
first one, again starting with a HUNK_HEADER. However, this time the
FIRST_HUNK longwords of the headers are not zero. They are identical to the
INITIAL_HUNK longs in the overlay table, described above. They contain
again the hunk number where this module should be placed. The LAST_HUNK is
no longer the number of hunks of the module minus one, but INITIAL_HUNK +
the number of hunks - 1, thus (guess what) the number of the last hunk in
this module. Next are the sizes of these hunks, as for a usual HUNK_HEADER.

What follows are again HUNK_CODEs, HUNK_DATAs and all the usual stuff to
specify data for the overlayed module.

This module is terminated by a HUNK_BREAK $000003f6, which tells the
overlay manager to stop loading here.

WARNING: As a result, an overlayed file never ends with HUNK_END $000003f2,
but with HUNK_BREAK $000003f6!
_____________________________________________________________________________

iv) How overlayed files look in memory: The overlay manager.

Now let's come to the overlay manager, which is executed first when an
overlay program gets started.

As I said above, the OS expects the Magic Word $0000abcd at offset four of
this hunk (its second longword), and fills the next four longs with vital
data. They are valid as soon as your code gets started, and are no longer
zero.
Here is what they look like:

STREAM:
        The first LW is a standard DOS stream, which is used to operate on
        the file. Unlike for regular executables, the stream REMAINS OPEN
        until the program quits. You don't need to close it, this is done
        by UnLoadSeg().

OVERLAYTAB:
        The next long is a usual pointer to the overlay table, the
        contents of the HUNK_OVERLAY described above. It points to the
        TREE_SIZE long in this table (not to the overlay byte size, as for
        all AllocVec'd memory).

HUNKBPTR:
        The third long is a BCPL BPTR to the overlay hunk table. It is a
        BCPL array, which is as long as the maximal number of hunks in the
        load file plus one. Thus, we have here TABLE_SIZE+1 longs, as
        specified in the HUNK_HEADER. Each entry in this array is a BCPL
        pointer to the hunk loaded at this position, or zero if this one
        is empty. The last entry of the array is always zero and serves as
        a terminator for the overlay manager. (This array is again
        allocated in AllocVec style, so the longword before the actual
        array contains its byte size, including the byte size field
        itself. This size tag is not considered to be part of the array.)

GLOBALVEC:
        The last longword is a BPTR to the DOS global vector. This vector
        is no longer used for its original purpose, and is a BCPL style
        analogue of the dos.library. It was used by BCPL overlay managers
        to do the loading because exec style library calls are not
        possible from BCPL. The UnLoadSeg() routine compares this longword
        with the global vector of the DOS, as part of a sanity check. If
        the longword is identical to the DOS GlobVec, and the Magic
        Longword $0000abcd is present, the segment is considered to be an
        overlayed file.

        BTW, this global vector has contained all the useful argument
        parsing functions SINCE EVER, EVEN IN KickStart 32.0. They have
        only been made available to the public starting with 2.0, though.

Again, a short overview:

FILEHANDLE      Dos stream (BCPL pointer) to this file.
OVERLAYTAB      C-pointer to overlay table in HUNK_OVERLAY.
HUNKBPTR        BCPL-pointer to the hunk array.
GLOBVEC         BCPL-pointer to dos global vector.
_____________________________________________________________________________

v) What do overlay references look like?

Provided one of your program segments contains a call to a child segment
that needs to be loaded, the following code will be generated by the
linker:

The subroutine call is redirected to a call of the overlay manager like
this:

ParentRoutine:
        ...
        bsr Child               ; <-- the call to an overlay node
        ...

First, if this is a relative jump, the linker generates an ALV for you (an
"absolute linker value"). This is sometimes unnecessary, however:

Child:
        JMP __OCV_Child         ; even if __OCV_Child is in the same hunk
                                ; and in jump range. linker bug?

This ALV jumps now to the actual overlay manager call:

__OCV_Child:
        jsr @ovlyMgr
        dc.w __OV_Child_Index

The overlay manager uses the return address of the above jsr call to find
the index position in the overlay table, and replaces the return address of
the call by the entry point of the overlayed function - which is hence
entered as soon as the overlay manager quits with an rts.

Remark: All this goes only for the "new style" overlay manager, which fits
well with the "registerized parameters" model. The code generated for the
old style overlay manager looks somewhat different:

__OCV_Child:
        move.w #__OV_Child_Index,d0     ;trash register d0 with the index
        jsr _ovlyMgr                    ;call the overlay manager

Additional remark: The overlay call vectors and ALVs are automatically
created by the SLink linker, as opposed to the (more than obsolete) ALink.
It's beyond the scope of this doc to describe the techniques needed for
ALink. Besides, all label names are made up by the author...

Possible optimizations:

- Once an overlay call has been resolved, the overlay call vector could be
  patched to jump directly to the destination routine instead of going
  through the overlay manager. That would speed up code.
  One must carefully restore the overlay manager calls as soon as the
  appropriate segments get unloaded.

- A clever linker should avoid the generation of the additional ALV JMPs.

- A clever overlay manager should be able to route other than direct calls
  to children. Additional information might be required.

- An improved overlay manager should be able to hold more nodes than just
  one line of the tree in memory - probably utilising a Use-Count field
  for the overlayed functions to unload the nodes.
_____________________________________________________________________________

v) What does the overlay manager now do?

First, I'm describing here the new style overlay manager. The other (older)
one is very similar except that the overlay index is passed in register d0
instead of being placed behind the overlay manager call. This obviously
conflicts with "registerized parameters" and is not very well suited for
most modern compilers or assembly language. Try to avoid it! The (antique)
CBM ALink linker is not able to create these new style overlay calls at
all; the SAS SLink (formerly BLink, but not the 6.7 PD version of the
Software Distillery) must be used.

1) Find out the return address of the overlay call. The index of the
   overlay reference within the OVERLAYTAB is kept at this address, as it
   has been placed there by the linker.

2) Get the location of the overlay table entry by this offset.

3) Get the location of the TREEPTRs within the overlay table. They keep
   the ordinate number of the module currently loaded at this position of
   the tree.

4) Check whether the overlay node currently loaded matches the one that
   was requested. If so, calculate the start address within this module
   and start the overlay procedure.

5) If the ordinate number does not match, it's either empty (no overlay
   module loaded at this position), or it is just the wrong child that is
   currently loaded.
   In this case: Unload the node plus all children, unlink the modules,
   and clear the references in the TREEPTRs array and in the hunk array.

6) Find out the location of the module to load. This is kept in the
   overlay table.

7) Set the file pointer of the executable file to this position.

8) Load the overlay module. This is done with a very special call to
   LoadSeg, which is nowhere documented. Unlike usual, the file name is
   NULL, but the hunk table BPTR is passed in d2 and the filehandle in
   d3. This is really a bit tricky... (And this part of the code is
   broken in the old style overlay manager).

9) Link this module to the already loaded hunks.

10) Calculate the hunk and the position of the overlay reference from
    the table. Start the module at this position.

_____________________________________________________________________________

For experts, I include the documented source of my overlay manager:

;*************************************************
;** SetVNC                                      **
;** A screen handler                            **
;** The preferences editor                      **
;** © 1990-98 THOR                              **
;**                                             **
;** Block: Overlay-Manager                      **
;**                                             **
;**                                             **
;** Version 3.02                                **
;** 05.07.1998                                  **
;*************************************************

        include INC:macros.asm
        include INC:exec_lib.asm
        include INC:dos_lib.asm

;** Based on the Overlay-Manager by Richard Evans
;** and Tim King
;** Bug corrections by THOR 16.07.1996
;** additional fixes by THOR 05.07.1998

        xdef    @ovlyMgr

;*************************************************
;** Offsets in the overlay-table                **
;*************************************************

        rsreset
ot_FilePosition:        rs.l    1       ;File position
ot_reserved:            rs.l    2       ;for whatever
ot_OverlayLevel:        rs.l    1       ;Overlay-Level
ot_Ordinate:            rs.l    1       ;Overlay-Ordinate
ot_InitialHunk:         rs.l    1       ;Initial hunk for loading
ot_SymbolHunk:          rs.l    1       ;Hunk containing symbol
ot_SymbolOffset:        rs.l    1       ;Offset of symbol
ot_len:                 rs.b    0

;*************************************************
;** Other stuff                                 **
;*************************************************
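;* Reader's note, not part of the original source: a worked layout of
;* the overlay-table entry described by the ot_ offsets defined above.
;* It assumes that rsreset starts counting at 0 and that rs.l reserves
;* 4 bytes per item, as the usual Amiga assemblers do.
;*
;*      ot_FilePosition    0       ot_Ordinate       16
;*      ot_reserved        4       ot_InitialHunk    20
;*      (2nd reserved LW   8)      ot_SymbolHunk     24
;*      ot_OverlayLevel   12       ot_SymbolOffset   28
;*                                 ot_len            32
;*
;* So one entry occupies 8 longwords. Read off the manager code at
;* @ovlyMgr, the two addresses it needs are computed as
;*
;*      entry = ol_OverlayTab + 4 * (index + first longword of table)
;*      slot  = ol_OverlayTab + 4 * ot_OverlayLevel(entry)
;*
;* where "index" is the word the linker placed behind "jsr @ovlyMgr"
;* and "slot" is the TREEPTRs longword that gets compared against
;* ot_Ordinate(entry). The names entry, slot and index are made up
;* for this note only.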
MajikLibWord    =       23456           ;Hey, don't know what this means

        section NTRYHUNK,CODE

;*************************************************
;** Manager starts here                         **
;*************************************************
Start:
        bra.w   NextModule              ;Jump to the next segment...

;* This next word serves to identify the overlay
;* supervisor to 'unloader'.
                dc.l    $ABCD           ;Majik longword for unloader
                                        ;(has to free hunk table,
                                        ;file handles,...)
;* The next four LWs are filled by the loader
ol_FileHandle:  dc.l    0               ;Overlay file handle (points to me)
ol_OverlayTab:  dc.l    0               ;Overlay table as found in the
                                        ;overlay hunk
ol_HunkTable:   dc.l    0               ;BPTR to Overlay hunk table
ol_GlobVec:     dc.l    0               ;BPTR to global vector (what for ?)
                dc.l    MajikLibWord    ;Majik library word as identifier
                dc.b    7,"Overlay"     ;Majik identifier
ol_SysBase:     dc.l    0               ;additional pointer
ol_DOSBase:     dc.l    0               ;to libraries
                dc.b    "THOR Overlay Mananger 1.1",0

@ovlyMgr:                               ;Entry-points
        saveregs d0-d3/a0-a4/a6         ;Save registers
        moveq   #0,d0
        move.l  10*4(a7),a0             ;return address, behind the ten
                                        ;saved registers
        move.l  ol_OverlayTab(pc),a3    ;get pointer to overlay table
        move.w  (a0),d0                 ;fetch the index from the return
                                        ;address
        move.l  a3,a4                   ;copy to a4
                                        ;Upper word MUST be zero
                                        ;(check this out in BLink)
        add.l   (a3),d0                 ;add length
        lsl.l   #2,d0                   ;get offset
        add.l   d0,a3                   ;Address of overlay entry
        move.l  ot_OverlayLevel(a3),d0  ;get overlay
        lsl.l   #2,d0
        adda.l  d0,a4
        move.l  ot_Ordinate(a3),d0      ;get required ordinate level
        cmp.l   (a4),d0                 ;compare with current ordinate level
        beq.s   .gotsegment
                                        ;not correct level
                                        ;clear all other entries behind this
        move.l  d0,(a4)+                ;fill with new overlay entries
        moveq   #0,d1
        do
         tst.l  (a4)                    ;terminate, if end of table found
         break.s eq
         move.l d1,(a4)+                ;clear this
        loop.s

        move.l  ot_InitialHunk(a3),d0   ;First hunk number to load
        add.l   ol_HunkTable(pc),d0     ;Plus BPTR of hunk table
        lsl.l   #2,d0                   ;Address of entry in hunk table
        move.l  d0,a4
        move.l  -4(a4),d0               ;get previous hunk: MUST be present
                                        ;(do not load children before master)
        beq     .noprevious
        lsl.l   #2,d0
        move.l  d0,a2
        move.l  d1,(a2)                 ;unlink fields before loading
                                        ;now free all hunks
        move.l  ol_SysBase(pc),a6
        do
         move.l (a4)+,d0                ;next hunk ? (terminated by zero)
         break.s eq
         lsl.l  #2,d0
         move.l d0,a1                   ;->a1
         cmp.w  #36,20(a6)              ;exec lib_Version: V36 or newer ?
         blo.s  .freemem
         jsr    FreeVec(a6)             ;use FreeVec for newer releases
         reloop.s                       ;since the DOS used AllocVec
.freemem:                               ;for allocation.
         move.l -(a1),d0                ;get length
         jsr    FreeMem(a6)             ;free this hunk (don't check if valid)
        loop.s                          ;and now the next

.retry:
        move.l  ol_DOSBase(pc),a6
        move.l  ol_FileHandle(pc),d1    ;Get our stream
        move.l  ot_FilePosition(a3),d2  ;get file position
        moveq   #-1,d3                  ;relative to beginning of file
        jsr     Seek(a6)                ;seek to this position
        tst.l   d0                      ;found something ?
        bmi.s   .loaderror              ;what to do on failure ?
                                        ;now call the loader
        move.l  ol_HunkTable(pc),d2     ;Hunk table
        moveq   #0,d1                   ;no file (is overlay)
        move.l  ol_FileHandle(pc),d3    ;Filehandle
        jsr     LoadSeg(a6)             ;load this stuff
        tst.l   d0                      ;found ?
        beq.s   .loaderror
        move.l  d0,(a2)                 ;add new chain (note that we removed
                                        ;that old bug, value is in d0)
                                        ;found this stuff
.gotsegment:
        move.l  ot_SymbolHunk(a3),d0    ;get hunk # containing symbol
        add.l   ol_HunkTable(pc),d0
        lsl.l   #2,d0                   ;get APTR to hunk
        move.l  d0,a4
        move.l  (a4),d0                 ;BPTR to hunk
        lsl.l   #2,d0
        add.l   ot_SymbolOffset(a3),d0  ;Offset
        move.l  d0,10*4(a7)             ;Set RETURN-Address
        loadregs
        rts

;*************************************************
;** Go here, if we find an error                **
;*************************************************
.loaderror:
        saveregs d7/a5
        move.l  ol_SysBase(pc),a6
        move.l  #$0700000C,d7           ;AN_BadOverlay, recoverable
        move.l  $114(a6),a5             ;ThisTask
        jsr     Alert(a6)               ;Post alert
        loadregs
        bra.s   .retry                  ;Retry or die

.noprevious:
        move.l  ol_SysBase(pc),a6
        move.l  #$8700000C,d7           ;dead end !
        move.l  $114(a6),a5             ;ThisTask
        jsr     Alert(a6)               ;Post alert
        bra.s   .noprevious

;*************************************************
;** NextModule                                  **
;** Open stuff absolutely necessary and         **
;** continue with main program code             **
;*************************************************
NextModule:                             ;why save registers ?
        move.l  a0,a2
        move.l  d0,d2                   ;keep arguments (BCPL stuff is
                                        ;not kept)
        lea     ol_SysBase(pc),a3
        move.l  ExecBase,a6
        lea     DOSName(pc),a1          ;get name of DOS
        move.l  a6,(a3)                 ;fill in Sysbase
        moveq   #33,d0                  ;at least 1.2 MUST be used
                                        ;(no support of antique stuff)
        jsr     OpenLibrary(a6)
        move.l  d0,4(a3)                ;Save back DOS base for loader
                                        ;(behind)
        beq.s   .nodosexit              ;exit if no DOS here
        move.l  d0,a6                   ;as a service, post this to the
                                        ;main code
        move.l  Start-4(pc),a0          ;Get BPTR of next hunk (will be
                                        ;first hunk in the system)
        adda.l  a0,a0
        move.l  d2,d0                   ;restore argument
        adda.l  a0,a0                   ;BPTR times four is the address
        exg.l   a0,a2                   ;move to a2
        jsr     4(a2)                   ;jump in
                                        ;Gee, I hope the main is clever
                                        ;enough to return here...
        move.l  d0,d2                   ;Save return code
        move.l  ol_SysBase(pc),a6
        move.l  ol_DOSBase(pc),a1
        jsr     CloseLibrary(a6)        ;Close the lib
        move.l  d2,d0                   ;Returncode in d0
        rts

.nodosexit:
        move.l  #$07038007,d7           ;DOS didn't open (this code is
                                        ;supposed to be part of the DOS)
        move.l  $114(a6),a5             ;ThisTask
        jsr     Alert(a6)
        moveq   #30,d0                  ;Something went really wrong !
        rts

DOSName:        dc.b    "dos.library",0

_____________________________________________________________________________

References:

[1] : The AmigaDOS Manual (3rd ed.), Bantam Computer Books
[2] : The Amiga Guru Book (R. Babel, Taunusstein 1993)

_____________________________________________________________________________

So, this is all folks. Hope you liked the show....

Thomas
                                                        July 1998

_____________________________________________________________________________

The THOR-Software Licence (v2, 24th June 1998)

This License applies to the documentation known as "Overlay.doc". The
"Program", below, refers to this data. The "Archive" refers to the
package of distribution, as prepared by the author of the Program,
Thomas Richter. Each licensee is addressed as "you".

The Program and the data in the archive are freely distributable under
the restrictions stated below, but are also Copyright (c) Thomas Richter.

Distribution of the Program, the Archive and the data in the Archive by
a commercial organization without written permission from the author to
any third party is prohibited if any payment is made in connection with
such distribution, whether directly (as in payment for a copy of the
Program) or indirectly (as in payment for some service related to the
Program, or payment for some product or service that includes a copy of
the Program "without charge"; these are only examples, and not an
exhaustive enumeration of prohibited activities).

However, the following methods of distribution involving payment shall
not in and of themselves be a violation of this restriction:

(i) Posting the Program on a public access information storage and
retrieval service for which a fee is received for retrieving information
(such as an on-line service), provided that the fee is not
content-dependent (i.e., the fee would be the same for retrieving the
same volume of information consisting of random data).

(ii) Distributing the Program on a CD-ROM, provided that

a) the Archive is reproduced entirely and verbatim on such CD-ROM,
   including especially this licence agreement;

b) the CD-ROM is made available to the public for a nominal fee only,

c) a copy of the CD is made available to the author for free except for
   shipment costs, and

d) provided further that all information on such CD-ROM is
   redistributable for non-commercial purposes without charge.

Redistribution of a modified version of the Archive, the Program or the
contents of the Archive is prohibited in any way, by any organization,
regardless whether commercial or non-commercial. Everything must be kept
together, in original and unmodified form.

Limitations.

THE PROGRAM IS PROVIDED TO YOU "AS IS", WITHOUT WARRANTY. THERE IS NO
WARRANTY FOR THE PROGRAM, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS.
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH
YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
NECESSARY SERVICING, REPAIR OR CORRECTION.

IF YOU DO NOT ACCEPT THIS LICENCE, YOU MUST DELETE THE PROGRAM, THE
ARCHIVE AND ALL DATA OF THIS ARCHIVE FROM YOUR STORAGE SYSTEM.

YOU ACCEPT THIS LICENCE BY USING OR REDISTRIBUTING THE PROGRAM.

                                                        Thomas Richter