Kryptostack
KryptoStack is licensed under the GNU General Public License v3.0.
See the LICENSE file for more details.
A minimalist stack-based programming language designed as a flexible
framework for implementing and exploring cryptologic algorithms.
Its syntax and semantics are inspired by PostScript and Forth, offering
a concise and expressive environment for learning and experimentation.
KryptoStack is an open-source project.
The following interests could be addressed by the project:
The project and the code are written in English.
For mathematical operators, whose creation is a primary goal of the project,
multilingualism is enabled.
Readability and clarity of the program code take precedence over optimization
in the implementation.
All artifacts can be built with a minimal toolset. Currently, these are
the GNU C++ compiler, make, flex, Bash, Boost and git. The project is hosted
on Gitlab, and the environment's features like CI/CD are used. However, a
build of all artifacts and tests must always be possible without Gitlab.
Documentation is preferably embedded in the code and should be cleanly
extractable with Doxygen. All documentation is included in the git repository.
The main branch of the code must always be buildable at the end of each
day, and the resulting artifacts must pass the automated tests.
There is a constant refactoring of the code and documentation.
There is an extensive catalog of automated tests. When making design
decisions, the testability of a possible solution is given high priority.
Install, if not already on your system, the GNU C++ compiler, make, flex,
the quadmath library, the Boost library and git. For the documentation
open the GitLab Pages of the project. There you can find the latest
project and user information.
To build the artifacts and run all tests:
make e2e
Optionally install Doxygen and LaTeX to generate the HTML documentation
in ./public/ with:
make public
To run an example:
make ks; make parser; ./kb Examples/Factorial.ks
To run all tests:
make e2e
To run all examples:
make examples
Use it as a calculator:
kc 25346 95095 gcd
kc 1 2 1 33 { mul } for
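For readers new to reverse Polish notation, the following C++ sketch computes the same two values in conventional notation. It assumes that `for` follows the PostScript convention (initial value, increment, limit, procedure) with a positive increment; this is inferred from the stated PostScript heritage and not confirmed here. The function names are hypothetical and not part of KryptoStack.

```cpp
#include <numeric>

// Hypothetical helper: mirrors `a b gcd`.
long long rpn_gcd(long long a, long long b) { return std::gcd(a, b); }

// Hypothetical helper: mirrors `acc initial step limit { mul } for`,
// assuming PostScript-like semantics and a positive step.
// unsigned __int128 is a GCC/Clang extension, used because 33! exceeds 64 bits.
unsigned __int128 rpn_for_mul(unsigned __int128 acc, int initial, int step,
                              int limit) {
  for (int i = initial; i <= limit; i += step) {
    acc *= static_cast<unsigned __int128>(i);
  }
  return acc;
}
```

Under these assumptions, `rpn_gcd(25346, 95095)` yields 19 and `rpn_for_mul(1, 2, 1, 33)` yields 33!, which fits into a 128-bit integer.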
This project leverages the following core technologies:
Optional tools:
Tools from the pipeline:
There are three build types for different software processes
and environments: DEVELOP, PRODUCTION and PROFILE.
If a KryptoStack operator is coded directly in C++, it is referred to as core code.
The second interpreter pass reads and interprets its KSN-Format input.
The file format for the stack-based programming language KryptoStack.
Each line of code can contain commands, which are short, case-sensitive
keywords, and operands, which are pushed onto a data stack and
manipulated by operators. The language supports various data types,
including integers, real numbers, booleans, strings, arrays, and
dictionaries, and follows a reverse Polish notation (RPN) syntax, where
operators follow their operands.
The exchange format between the first interpreter pass, called "parser",
and the second interpreter pass, called "ks".
A kind of semantic object that operates on other semantic
objects, the stacks or the interpreter itself. Some operators are
procedures and the others are core code.
A one-character code for each SO class. The virtual ot() member function
of all SO classes returns this code and allows explicit type-based decisions
in addition to C++ polymorphism.
See: object types
The first interpreter pass checks the syntax and converts the program
into KSN-Format.
The procedure level counts the nesting level of curly braces in the
program code.
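As a minimal sketch of this definition, the function below counts the brace nesting of a piece of program text. The name `procedure_level` is hypothetical, and the sketch deliberately ignores braces inside strings and comments, which a real parser would have to handle.

```cpp
#include <string_view>

// Returns the nesting depth of curly braces after scanning `code`,
// or -1 if a closing brace appears without a matching opening brace.
// Simplification: braces inside strings or comments are not skipped.
int procedure_level(std::string_view code) {
  int level = 0;
  for (char c : code) {
    if (c == '{') {
      ++level;
    } else if (c == '}' && --level < 0) {
      return -1;
    }
  }
  return level;
}
```

For example, `procedure_level("1 2 { add { mul")` returns 2, because two procedures are still open at the end of the text.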
For the classes Interpreter and Context there are attributes, together with
Setters and Getters, which change the behaviour. These properties can be set
with KryptoStack language operators.
Forth calls them words.
We use IDEA:, TODO: and FIXME: as task tags.
All tools support a -v option to generate more runtime information.
A subdirectory that serves as a container of ks files and that creates one
or more functionally related dictionaries when executed.
Reads the standard input stream, runs checks and normalizes it into the
KSN-Format on standard output.
Ignores the remainder of lines starting with %.
Halts execution upon error detection.
Normalizes I tokens by stripping leading plus signs.
Removes the opening and closing parentheses from the string.
Removes all single backslashes from the string, preserving double
backslashes (\\) and newline sequences (\n). These are the only
escape sequences recognized in the KSN format.
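The normalization rules above can be sketched as follows. This is an illustration under assumptions, not the flex-based parser itself: the function names are hypothetical, the string token is assumed to arrive with its enclosing parentheses, and a backslash is treated as "single" whenever it is not followed by a backslash or the letter n.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical sketch: strips leading plus signs from an integer token.
std::string normalize_integer_token(std::string_view token) {
  const std::size_t first = token.find_first_not_of('+');
  if (first == std::string_view::npos) return std::string();
  return std::string(token.substr(first));
}

// Hypothetical sketch: removes the enclosing parentheses and every
// backslash that does not start one of the sequences \\ or \n.
std::string normalize_string_token(std::string_view token) {
  if (token.size() >= 2 && token.front() == '(' && token.back() == ')') {
    token = token.substr(1, token.size() - 2);
  }
  std::string out;
  for (std::size_t i = 0; i < token.size(); ++i) {
    if (token[i] != '\\') {
      out += token[i];
      continue;
    }
    const bool has_next = i + 1 < token.size();
    if (has_next && (token[i + 1] == '\\' || token[i + 1] == 'n')) {
      out += token[i];    // keep the recognized two-character sequence
      out += token[++i];
    }
    // Otherwise the single backslash is dropped; the following
    // character is copied by the next loop iteration.
  }
  return out;
}
```

With these definitions, the token `(a\tb)` becomes `atb`, while `(a\\b\nc)` keeps both escape sequences and only loses its parentheses.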
The -v option activates verbose mode.
The -h option prints a command line help.
This is a simple line-oriented format.
Each line begins with a KSNCode followed by a colon (:).
The code identifies the data type of the subsequent value.
The different KSNCodes and data types are:
>I: A signed integer
>R: A floating-point number
>B: A boolean literal
>S: An unquoted string with just two escape sequences \\ and \n
>N: A literal name token consisting of any characters but whitespaces
>X: An executable name token consisting of any characters but whitespaces
>#: The value is a comment
>E: An error message
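A reader of this format only has to split each line at the first colon. The sketch below illustrates that step. It assumes that the KSNCode is exactly one character at the start of the line and that the value follows the colon directly; the type `KsnLine` and the function `split_ksn_line` are hypothetical names, not taken from the ks sources.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical representation of one KSN line.
struct KsnLine {
  char code;          // one of I, R, B, S, N, X, #, E
  std::string value;  // everything after the colon, unmodified
};

// Splits a line such as "I:42" into its code and value.
// Returns std::nullopt for lines that do not match the assumed layout.
std::optional<KsnLine> split_ksn_line(std::string_view line) {
  static constexpr std::string_view known_codes = "IRBSNX#E";
  if (line.size() < 2 || line[1] != ':') return std::nullopt;
  if (known_codes.find(line[0]) == std::string_view::npos) return std::nullopt;
  return KsnLine{line[0], std::string(line.substr(2))};
}
```

Because the code is checked before the value is interpreted, a consumer can dispatch on `code` with a simple `switch` and treat unknown codes as errors.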
This is a list of SO classes and their OTCode.
All objects are either executable or non-executable (literal).
This executable status is reflected in different OTCodes for
name objects and array objects.
A duplicate of an object of these types duplicates the value of the object.
SOL - 0
The null object is used as placeholder in arrays.
SOB - B
A boolean value.
SOI - I
A 128-bit integer.
SOM - M
A mark object.
SON - N or n
N ... for non-executable name objects
n ... for executable name objects
SOO - O
A regular registered operator.
SOo - o
An unregistered operator.
SOR - R
A real number.
A duplicate of an object of these types shares its value with the
original object.
SOA - A and a
A ... for a non-executable array
a ... for an executable array aka procedure
SOD - D
A dictionary is a list of key-value pairs of SO's.
SOS - S
A string.
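The difference between the two groups of object types can be illustrated with a small C++ sketch. All class names below are hypothetical stand-ins for the SO classes, the integer uses `long long` instead of 128 bits for brevity, and the use of `std::shared_ptr` is only an assumption about how shared values could be modeled.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class standing in for the SO hierarchy.
class Obj {
public:
  virtual ~Obj() = default;
  virtual char ot() const = 0;                   // one-character type code
  virtual std::unique_ptr<Obj> dup() const = 0;  // duplicate this object
};

// Value semantics, as described for SOI: the duplicate owns its own value.
class IntObj : public Obj {
public:
  explicit IntObj(long long v) : value(v) {}
  char ot() const override { return 'I'; }
  std::unique_ptr<Obj> dup() const override {
    return std::make_unique<IntObj>(value);
  }
  long long value;
};

// Shared semantics, as described for SOS: the duplicate refers to the
// same underlying string as the original.
class StrObj : public Obj {
public:
  explicit StrObj(std::string s)
      : value(std::make_shared<std::string>(std::move(s))) {}
  char ot() const override { return 'S'; }
  std::unique_ptr<Obj> dup() const override {
    return std::make_unique<StrObj>(*this);  // copies the pointer only
  }
  std::shared_ptr<std::string> value;
};
```

Modifying the duplicate of an `IntObj` leaves the original untouched, whereas appending to the duplicate of a `StrObj` is visible through the original as well.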
Extension | Description |
---|---|
ks | KryptoStack language source |
cpp, h | C++ source |
md | Markdown documentation |
l, c | flex source and generated code |
yml | YAML data |
json | JSON data and configuration |
sh, inc | Bash script |
none | executables and some Bash scripts |
tpl | Bash template |
cloc | CLOC configuration |
kdev4 | Kdevelop configuration |
gcno | profiling structure information |
gcda | profiling run data |
html, css | HTML source |
info | lcov data |
out | end-2-end test reference output |
Extra | Description |
---|---|
.gitignore | git ignore |
LICENSE | License ASCII text |
Doxyfile | Doxygen configuration |
compile_flags.txt | clang configuration |
./Coverage
Test coverage data.
./Examples
Example programs.
./.git
That's the git Repository.
./.kdev4
Is a living proposal for a KDevelop configuration.
./public
Location for Gitlab Pages web server files.
The Doxygen-generated HTML documentation can be found in ./public/doxygen/
./Suite*
Each 'Suite' directory contains a collection of test cases.
./tmp
Exists for temporary files e.g. for testing purposes.
./Tools
Contains TOOLS for the development, e.g. git hooks.
./vocabularies
Is only a container for its subdirectories.
These subdirectories are called vocabularies and the names of the
subdirectories are used to identify the vocabularies.
./.vscode
Is a living proposal for a Visual Studio Code configuration.
See EXAMPLES
See LANGUAGE
See TESTS
See TOOLS.
There are three so-called build types. They denote build and, in particular,
compilation settings for different environments.
DEVELOP
Compilation without time-consuming optimizations.
Used to develop and debug the code.
PRODUCTION
Compilation with good optimization. All debug-code is removed.
All asserts are removed.
Used for the GitLab pipeline.
PROFILE
Compilation without any optimization. No inlining of functions.
Compilation with instrumented code to generate profiling data.
Used to analyze function, line and branch test coverage.
The make utility can be directed by: BUILD_TYPE=PRODUCTION make ks
or, as in the GitLab pipeline configuration, by: make BUILD_TYPE=PRODUCTION ks.
The command line tools kb, ks, and parser print the build type with which
they were generated as the first line in their help text.
We use a two-part Version Number consisting of a major and minor
version, separated by a period. We started with "0.9".
Release Numbers are assigned to these version numbers by appending
a sequential number starting from 1. Therefore, the first release number
was 0.9.1. When the version number is increased, the last number is reset
to 1. Consequently, the first release of version 1.0 is 1.0.1.
Release numbers are stored in git as Release Tags prefixed with 'v'.
Thus, the first release tag is "v0.9.1". In git, the first commit intended
for a release is tagged with this release tag.
The last commit in the release cycle receives a Release Completion Tag
in git. This release completion tag in git has the form of the release tag
with the appended text "--release". Therefore, the first release
completion tag is named "v0.9.1--release".
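The naming scheme described above can be expressed in a few lines of C++. The struct and function names are hypothetical and only illustrate how the tags are derived; they are not part of the project's tooling.

```cpp
#include <string>

// Hypothetical representation of a release number such as 0.9.1.
struct Release {
  int ver_major;  // major version
  int ver_minor;  // minor version
  int seq;        // sequential release number, starting from 1
};

// Release tag: the release number prefixed with 'v', e.g. "v0.9.1".
std::string release_tag(const Release& r) {
  return "v" + std::to_string(r.ver_major) + "." +
         std::to_string(r.ver_minor) + "." + std::to_string(r.seq);
}

// Release completion tag: the release tag with "--release" appended.
std::string completion_tag(const Release& r) {
  return release_tag(r) + "--release";
}

// Next release within the same version: only the last number grows.
Release next_release(Release r) {
  ++r.seq;
  return r;
}

// First release of a new version: the last number is reset to 1.
Release first_release_of(int ver_major, int ver_minor) {
  return Release{ver_major, ver_minor, 1};
}
```

For the first release, `release_tag({0, 9, 1})` gives `v0.9.1` and `completion_tag({0, 9, 1})` gives `v0.9.1--release`, matching the tags named in the text.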
For each release completion tag, a Gitlab Release Object, i.e., a
release in the sense of Gitlab, is created.
When a new release is created, a new section is opened in the
CHANGELOG.md for this release. The Release Date of the previous
release is also assigned then. This is the date the respective Git
release object was created.
The section of the previous release in the CHANGELOG is manually revised
when a new release is created. The revised content of the CHANGELOG section
for a release serves as the Release Notes.
For each commit, the corresponding commit message should be appended to
the CHANGELOG. This can be done automatically by installing a git hook
from TOOLS.
A release consists of:
git push
git push origin tag vx.y.z--release
git push
git push origin tag vx.y.z+1
make clean
make
release-info output
make e2e
./ks -h
release-info output
The first character of a class name is an uppercase letter.
Class attributes end with an underscore.
Parameters are consistently prefixed with p_.
Functions within anonymous namespaces are consistently prefixed with s_.
Test case names are prefixed with B_.
Test suite names are prefixed with Suite.
using namespace xyz; is not used at all.
An indentation consists of 2 spaces.
A maximum of 4 levels of indentation should be maintained (a never nesting approach).
We use the dollar sign character as part of identifiers.
The test coverage is determined using gcov. The latter requires
a clean separation of code lines for each statement to be measured.
prerequisites are:
The available makefile targets can be displayed by using the
make command without options.
We use Ghostscript as our reference implementation for PostScript.
Hint: To invoke gs without rendering, you can set the environment
variable with export GS_DEVICE=nullpage.
The makefile will be triggered from the Gitlab pipelines. All process
details are implemented in the makefile and its helping scripts.
The different images in use for the CI/CD stages and the various
package prerequisites are documented within the GitLab YAML file for CI/CD.
Main interpreter loop
SON::load_exec()
Processes one line of the KSN-format input.
{, } are pushed onto
Looks up a name and executes it.
Calls the C++ machine code associated with the SOO and SOo.
Unfolds duplicates of the array-content to the execution stack.
Replaces executable names with operator objects, recursing into elements
that are SOA. Also performs an optimization/compilation if Context::compile_
is set to true.
These dictionary stack manipulations influence what is found by
SON::load_exec().
The exec operator pushes
SOA::unfold2exec()
andonto the execution stack.
The loop operators push SOA::unfold2exec() onto the execution stack.
The minimum C++ standard in use is C++20 with GNU extensions.
Remarkable C++ features in use:
static inline int attributes within
[[nodiscard]] and [[noreturn]] attributes
std::initializer_list<>
std::source_location
C++ features that we intentionally avoid:
Patterns and architecture:
Invariants, pre- and postconditions are documented inline with Doxygen.
Certain contracts can be fulfilled by strong typing:
const qualifier for call parameters IDEA: TBD
Classes with invariants have to inherit protected from class DbC from dbc.h.
A protected bool invariant() const noexcept implements all invariant checks
listed in the Doxygen @invariant class documentation.
If a class has a parent class with an invariant(), then it must call
this Parent::invariant().
#ifndef DBC_IS_VOID guards the invariant checker and auxiliary code
to exclude them in BUILD_TYPE PRODUCTION.
DBC_INV_CTOR(classname); will be used
DBC_INV; will be used
DBC_INV_RAII(classname) can be used to force the call to invariant()
at function return.
It does the checks, but not in BUILD_TYPE PRODUCTION.
IDEA: no instantiation forced with non-public ctors, etc.
The preconditions are listed in the Doxygen @pre member function
documentation. DBC_PRE is used for the implementation of the
precondition checks.
The postconditions are listed in the Doxygen @post member function documentation.
DBC_POST is used for the implementation of the postcondition checks.
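The interplay of invariant, precondition and postcondition checks can be sketched with a small example class. The definitions of the DBC_ macros and of class DbC are not shown in this document, so the sketch uses plain `assert()` in their place and does not inherit from DbC; `BoundedCounter` is a hypothetical class made up for illustration.

```cpp
#include <cassert>

// Hypothetical example class. assert() stands in for the project's
// DBC_PRE, DBC_POST, DBC_INV and DBC_INV_CTOR macros.
class BoundedCounter {
public:
  explicit BoundedCounter(int limit) : limit_(limit) {
    assert(invariant());  // role of DBC_INV_CTOR
  }

  /**
   * Increments the counter by one.
   * @pre  value() < limit()
   * @post value() is one greater than before the call
   */
  void increment() {
    assert(value_ < limit_);          // role of DBC_PRE
    [[maybe_unused]] const int old_value = value_;
    ++value_;
    assert(value_ == old_value + 1);  // role of DBC_POST
    assert(invariant());              // role of DBC_INV
  }

  int value() const noexcept { return value_; }
  int limit() const noexcept { return limit_; }

protected:
  /** @invariant 0 <= value_ <= limit_ */
  bool invariant() const noexcept { return 0 <= value_ && value_ <= limit_; }

private:
  int value_ = 0;
  int limit_;
};
```

Compiling with `NDEBUG` removes the `assert()` checks, which loosely corresponds to the way DBC_IS_VOID excludes the checks in BUILD_TYPE PRODUCTION.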
The installation of the Boost library in a GitLab pipeline image requires
apt -y install gfortran- libboost-all-dev
because of Fortran config issues.
Following boost modules are in use:
BOOST_ASSERT, BOOST_ASSERT_MSG and BOOST_ASSERT_IS_VOID
are used as building blocks for
DBC_PRE, DBC_POST, DBC_INV, DBC_INV_CTOR, DBC_INV_RAII and DBC_IS_VOID.
See TESTS
and code statistics
cppcheck is used as an integrated code tool in KDevelop and as a target within
the makefile.
clang-tidy is controlled by the makefile.
The output is sent to tmp/statcode1.txt
Integration into Visual Studio Code is TBD.
Integrates with KDevelop. Detailed config TBD.
makefile integration TBD.
The tool cloc is integrated in the makefile. It generates CLOC,
which is part of the documentation.
shellcheck is a Unix shell lint. It is integrated with the component
mechanism in the GitLab pipeline.
Some checks by shellcheck are disabled generally within the GitLab
YAML configuration.
SC2086 is disabled with a comment convention within the shell scripts
with # shellcheck disable=SC2086.
markdownlint-cli2 is a Markdown lint. It is integrated with the
component mechanism in the GitLab pipeline. The check is configured
to always succeed.
Has built-in analysis for the editor.
KDevelop requires a compile_commands.json file to run clang-tidy and Clazy.
bear generates this file, with an integration in the makefile.
Has built-in analysis for the editor.
GitLab SAST is a GitLab-maintained pipeline feature for automatic security tests.
The task tags information is extracted using the Tools/tasktags.sh
script.
This script is directly integrated into the makefile, making it part of
the automated build process. Editors and IDEs support this kind of tagging
by highlighting the texts.
We utilize three distinct tags to categorize different types of tasks.
However, specific configuration might be required depending on the editor
or IDE used.
FIXME:
This is a known bug in the software. This bug should be fixed in the
next software release.
TODO:
A task that still needs to be completed. This task should be finished
before the next major software release.
IDEA:
This is an idea for a potential improvement to the software. Implementing
this idea is planned, but the exact timing is not yet determined.
The current list of task tags can be found under TASKTAGS
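To illustrate what recognizing a task tag means, the sketch below finds the first of the three tags in a line of text. The actual extraction is done by the Tools/tasktags.sh script, whose logic is not shown here; `find_task_tag` is a hypothetical function and only demonstrates the tag convention.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Returns the task tag that occurs first in `line`, if any.
std::optional<std::string> find_task_tag(std::string_view line) {
  static constexpr std::string_view tags[] = {"FIXME:", "TODO:", "IDEA:"};
  std::optional<std::string> found;
  std::size_t best = std::string_view::npos;
  for (std::string_view tag : tags) {
    const std::size_t pos = line.find(tag);
    if (pos < best) {  // npos is the largest value, so misses never win
      best = pos;
      found = std::string(tag);
    }
  }
  return found;
}
```

A line such as `// TODO: refactor` yields `TODO:`, and a line without any tag yields an empty result.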
The following GitLab features are used:
Additional features will be integrated over time.
KDevelop creates a .kdev4 file and a ./.kdev4 directory. These files
are integrated into the git repository to share them.
VSC creates a ./.vscode directory. This directory is integrated into
the git repository to share it.
The 'Todo Tree' extension can handle our task tags appropriately.
Commenting is done in Doxygen Javadoc style. This allows for loose
integration into KDevelop, which parses these comments.
JAVADOC_AUTOBRIEF is enabled in Doxyfile to eliminate explicit @brief tags.
The Doxygen output is located in the ./public/doxygen directory.
The launch of Doxygen is integrated as a target within the makefile.
The Doxygen output is published as GitLab pages. The process is automated
using GitLab's CI/CD features.
We use the style /** to start comments. Functions are documented at
their point of declaration. Inline LaTeX formulas are used to enhance
the output.