
The SemWare Editor

Possible Future Projects

In no particular order

UniConv

This is a possible future project.

The Unicode extension already lets us convert ASCII, ANSI, and Unicode to each other interactively, but sometimes I need a non-interactive version.

UniConv would non-interactively convert a file to a file or a buffer to a buffer.

A file would be treated as binary input.
For a buffer it would depend on TSE's binary status for the buffer.
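
A minimal sketch of what a non-interactive conversion could look like, written in Python rather than TSE's macro language. The function names, the default encodings (Windows-1252 as "ANSI", UTF-8 as "Unicode"), and the choice to fail on undecodable input are all assumptions for illustration, not a design decision of UniConv.

```python
from pathlib import Path


def convert_bytes(data, src_encoding, dst_encoding):
    """Buffer-to-buffer conversion: bytes in one encoding to bytes in another."""
    # decode() raises UnicodeDecodeError if the input is not valid src_encoding.
    return data.decode(src_encoding).encode(dst_encoding)


def convert_file(src, dst, src_encoding="cp1252", dst_encoding="utf-8"):
    """File-to-file conversion that treats the input file as binary."""
    raw = Path(src).read_bytes()  # binary read: line endings are left untouched
    Path(dst).write_bytes(convert_bytes(raw, src_encoding, dst_encoding))
```

Because the file is read and written as bytes, line endings pass through unchanged; only the character encoding differs between input and output.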

Without a shred of evidence I have a tiny hope that a file-based conversion could be faster. That would be icing on the cake.

English

This is a possible future project.

I have come to find Sammy's TSE English syntax hiliting file a delight for catching spelling errors in .txt files.

I now would love to have a TSE extension with these additional features:

  • A cleaned-up syntax hiliting file to start from.
    The current file is pleasantly complete, but also contains a lot of weird words.
  • Better hiliting of numbers and punctuation marks.
  • An easy way to add a not-yet-hilited word from a text to the syntax hiliting.
  • A way to also apply English hiliting to comments in programming languages.
    Mixed syntax hiliting will be a hard problem, which might require a separate future project and extension.

20 Oct 2023

DirList → Test → DirList → UniBrowse → Remote editing

One thing is leading to another, again.

Initially I just wanted to implement remote editing for the HTTP(S) and SSH protocols.

However, remote filenames can contain Unicode characters too, and TSE does not natively support them. So I researched whether it would make sense to make the new functionality part of the existing UniBrowse extension, which already handles local filenames that may contain Unicode characters. My conclusion is that it does. The UniBrowse extension also implements browsing directories and files that are extended with alternate data streams.

In examining the UniBrowse extension I was confronted with my earlier design choice to implement listing and accessing "Unicode names" and alternate data streams with DOS commands instead of Windows APIs. Some googling to revisit that decision turned up Windows APIs that do not seem too hard to use. Using Windows APIs would speed up UniBrowse immensely and make it a lot more robust.

Before committing to a big tool like UniBrowse, it makes sense to first try these new APIs in a limited tool like DirList. And before that, in an even more limited test macro that just tests whether using these APIs from TSE actually works.

So here I currently am, updating DirList, to maybe later write a test macro testing Windows file APIs, to maybe later rewrite DirList to use them, to maybe later rewrite UniBrowse to use them, and to maybe eventually extend UniBrowse with remote editing capabilities.

So basically I am contemplating rewriting another core part of the editor using its macro language, this time its file browsing. I am aware of the absurdity.

ViewFindsHi

This is a possible future project.

Syntax highlight the View Finds list.

A TSE extension should be able to do this. The biggest hurdle would be multi-line comments, which is unfortunate, because hiliting comments and not-comments as such would also be the extension's greatest benefit!

Base64

This is a possible future project.

I needed and created a quick & dirty version of a tool that can convert a base64-encoded text to readable text.

It "works", but is currently so user-unfriendly that I do not consider it publishable. To do.
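
The core of such a conversion is small, as this Python sketch suggests. It is not the quick & dirty tool mentioned above. The function name, the UTF-8 default, and the tolerance for line breaks and missing padding are assumptions made for illustration.

```python
import base64
import binascii


def decode_base64_text(encoded, encoding="utf-8"):
    """Decode base64 text to readable text; return None if it is not valid base64."""
    compact = "".join(encoded.split())        # tolerate line breaks and spaces
    compact += "=" * (-len(compact) % 4)      # restore padding that was stripped
    try:
        raw = base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return None
    # Undecodable bytes become U+FFFD instead of aborting the conversion.
    return raw.decode(encoding, errors="replace")
```

Returning None for invalid input leaves it to the caller to decide how to report the problem, which is where most of the user-friendliness work would be.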

UniDraw

This is a possible future project.

Linux TSE and Windows GUI TSE gave us the ANSI character set, which gained us language compatibility between at least 14 languages, but lost us line drawing capability.

Unicode supports line drawing, and the Unicode and UniView extensions give TSE Unicode compatibility, but there is no tool yet that uses the Unicode character set for line drawing.

UniDraw could be that tool.
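
As an illustration of the characters involved, here is a small Python sketch that builds a rectangle from the light box-drawing characters in Unicode's Box Drawing block (U+2500 to U+257F). The function name and its interface are hypothetical and say nothing about how UniDraw would actually work.

```python
def draw_box(width, height):
    """Return the lines of a width x height rectangle made of box-drawing characters."""
    if width < 2 or height < 2:
        raise ValueError("a box needs at least 2 columns and 2 rows")
    top = "\u250c" + "\u2500" * (width - 2) + "\u2510"      # corner, horizontals, corner
    middle = "\u2502" + " " * (width - 2) + "\u2502"        # vertical, spaces, vertical
    bottom = "\u2514" + "\u2500" * (width - 2) + "\u2518"   # corner, horizontals, corner
    return [top] + [middle] * (height - 2) + [bottom]
```

Each of these characters takes three bytes in UTF-8, so a tool like this would depend on the editor displaying multi-byte characters in a single column.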

UniClip

This is a possible future project.

Copying and pasting Unicode text from and to the Windows clipboard is already possible with the Unicode extension. However, this functionality needs its own extension, provisionally called UniWinClip or UniClip. The main reason is that the Unicode extension is currently too big for debugging.

A secondary goal would be to add the capability to paste the Windows clipboard as a column block. There are two known ways to do this. One is by giving the tool a parameter to do so. The other is by doing a column block paste from the Windows clipboard if the text contains a marked column block, because this is what Visual Studio and VSCode do. The second way might need to be optional.
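
To make "paste as a column block" concrete, here is a Python sketch under simplifying assumptions: the buffer is a list of strings, columns are counted in characters rather than bytes, and the function name is hypothetical. How the clipboard text is obtained, and how a marked column block is detected in it, is left out.

```python
def paste_column_block(lines, block, row, col):
    """Insert the rows of block into lines at (row, col), shifting existing text right."""
    width = max((len(block_row) for block_row in block), default=0)
    result = list(lines)
    while len(result) < row + len(block):     # extend the buffer if the block runs past it
        result.append("")
    for offset, block_row in enumerate(block):
        line = result[row + offset].ljust(col)            # pad short lines up to the column
        result[row + offset] = line[:col] + block_row.ljust(width) + line[col:]
    return result
```

Every block row is padded to the width of the widest row, so the text to the right of the insertion point shifts by the same amount on every line.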

A tertiary goal would be to reproduce TSE's capability to reliably copy column blocks between TSE sessions.

UniView v2

This is a possible future project.

Two major improvements might be possible:

  • It might be possible to get rid of UniView's flickering line updates by using TSE's HookDisplay() command.
  • It should mostly be possible to make UniView work for the current line too.
    However, this implies taking over the backspace, delete, tab and horizontal cursor keys to compensate for one displayed Unicode character being implemented by multiple underlying bytes.
    I am not going to implement the tab key at first, and maybe not at all. It will be really hard to implement given all its configuration options and macro implementations.
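
The compensation for multi-byte characters can be sketched as follows, assuming the buffer holds UTF-8 (an assumption, as are the function names). Backspace then has to remove every byte of the character before the cursor, not just the last one.

```python
def previous_char_start(data, pos):
    """Return the byte offset where the UTF-8 character ending just before pos starts."""
    if pos <= 0:
        return 0
    start = pos - 1
    # Continuation bytes look like 0b10xxxxxx; step back over at most three of them.
    while start > 0 and pos - start < 4 and (data[start] & 0xC0) == 0x80:
        start -= 1
    return start


def backspace(data, pos):
    """Delete the whole character before pos; return the new bytes and cursor offset."""
    start = previous_char_start(data, pos)
    return data[:start] + data[pos:], start
```

The delete key and the horizontal cursor keys would need the mirror image of this logic: stepping forward over a lead byte and all of its continuation bytes.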

These two improvements would again bring TSE major steps closer to being a full Unicode editor.

I do not plan to solve these issues:

  • Misaligned tab positions after a non-ANSI character.
  • A column block that only partially contains a non-ANSI character.
  • Sorting on non-ANSI characters.

UniCode Suite

This is a possible future project.

There already is a group of separately distributed extensions to make TSE more compatible with Unicode.

It would be nice to be able to distribute these as a coherent bundle.

(I have renamed the "Edit2" tool to "UniBrowse" in anticipation of this project.)

SmoothScroll

This is a possible future project.

In creating the UniView extension I noticed how the parameters of the Windows TextOut API might be used to implement smooth scrolling in TSE.

One day I would like to find out whether this works, just for the fun of it.

WordWrap
No versioning yet
23 Jun 2024
TSE v4 upwards

This extension, which is still under consideration, would replace, change and extend TSE's WordWrap functionality.




Summary of main differences with TSE's wordwrapping:

  • A paragraph can also be indicated by an indented first line, a bullet point, an enumerator, a multi-line comment, a block of consecutive single-line comments, and HTML tags.
  • Wordwrap scope and state are maintained per buffer.
  • Default wordwrapping is OFF.
  • When turned ON it keeps wrapping the current paragraph, and turns OFF again when the cursor leaves it.
  • A broader scope can be specified by the user, but then other paragraphs in the same buffer only start being wordwrapped when the user changes them.
  • Right margin "0" means "the window width".




The following more detailed points are being considered:

  • The type of wordwrapping will depend on whether the type of the current buffer is a "text buffer" or a "program/data buffer".
  • Here a "text buffer" means a buffer with an extension for which no comment-syntaxhiliting is defined, and all other buffers are of type "program/data buffer".
  • For a "program/data buffer":
    • Wordwrapping is default off.
    • The user can toggle wordwrapping between these two values:
      • Off: No wordwrapping is done.
      • On: If the cursor is in a wordwrappable block, then the block is wordwrapped and keeps being wordwrapped until the cursor leaves the block.
    • A wordwrappable block is a multi-line comment, a block of consecutive single-line comments, or text between HTML tags.
    • Instead of "On" the extension could show the type of wordwrappable block it recognizes.
  • For a "text buffer":
    • Wordwrapping is default off, unless persistence "File" applies.
    • The user can cycle wordwrapping through these persistences:
      • None: No wordwrapping is done.
      • Paragraph: The current paragraph is wordwrapped and keeps being wordwrapped until the cursor leaves the paragraph.
      • Buffer: As "Paragraph" plus the same for other paragraphs in this buffer. For switched-to paragraphs wordwrapping starts when the user makes a change.
      • File: As "Buffer" plus for this file it persists across TSE sessions.
    • Wordwrapping will honor TSE's left and right margin settings.
      When the right margin is 0, wordwrapping will use the width of the editing window.
  • A paragraph is a piece of text in a "text buffer" delimited by:
    • An empty line.
    • An indented line at the start of the paragraph.
    • The start of a bullet point.
    • The end of a bullet point.
  • Bullet points can have levels: A bullet point can contain bullet points.
  • Wordwrapping "text between HTML tags" will only be done for text between tags that contains no other tags, other than a few excepted tags.
    Such excepted tags: <strong>, <em>, <br>.
    <br> will act as a wordwrap delimiter.
    Text between <pre> tags will not be wordwrapped.
  • The extension will show TSE's "W" indicator in the status bar when a block or paragraph is actively being wordwrapped.
  • Probably: Recognize other tag extensions than ".html" and ".htm".
    It is not uncommon to encounter .xml files where the whole file is on one line. While the xml structure would need to be reformatted by another program, it might become this program's task to wrap the lines inside tags.
  • Maybe: If the "Status" extension is installed, then it could (optionally) be made to show which type of wordwrapping is active.
  • Maybe: Change the background color of a paragraph or block that is actively being wordwrapped.
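
The paragraph delimiters listed above for a "text buffer" can be sketched in Python as below. This is a simplified illustration, not the planned implementation: it only knows the bullets "-", "*" and "+", it ignores bullet levels, enumerators and HTML tags, and all names are hypothetical.

```python
import re

BULLET = re.compile(r"\s*[-*+]\s+")   # assumed bullet characters


def indent_of(line):
    """Number of leading whitespace characters."""
    return len(line) - len(line.lstrip())


def split_paragraphs(lines):
    """Group lines into paragraphs; each paragraph is a list of lines."""
    paragraphs, current = [], []
    for line in lines:
        if not line.strip():
            starts_new, keep = True, False           # an empty line delimits
        elif BULLET.match(line):
            starts_new, keep = True, True            # the start of a bullet point
        elif current and BULLET.match(current[0]):
            # Inside a bullet point: a line that is not indented further ends it.
            starts_new, keep = indent_of(line) <= indent_of(current[0]), True
        else:
            # Plain text: a line indented deeper than the previous one starts a paragraph.
            starts_new = bool(current) and indent_of(line) > indent_of(current[-1])
            keep = True
        if starts_new and current:
            paragraphs.append(current)
            current = []
        if keep:
            current.append(line)
    if current:
        paragraphs.append(current)
    return paragraphs
```

Note that an indented line is treated differently depending on context: inside a bullet point it continues the bullet, elsewhere it starts a new paragraph.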




Notes:

  • Semware and I mainly use a "-" as a bullet point.
    Other seen bullet points are "*", "+" and "ú".
    The "ú" is shown as a bullet ("∙") in some OEM character sets. TSE's Help is shown using the system's OEM character set.
    In Windows GUI TSE's ANSI character set, characters 149 ("•") and 176 ("°") could be bullet characters too.
  • I use bullet points inside comments, so it makes sense to me that there they should be wordwrapped too.
  • The text of the "Expr" macro has two different occurrences of when to ignore "bullet points".
  • Examples of how to wordwrap bullet points.
    Not like this:
      Groceries: - The red laundry detergent
    but like this:
      Groceries:
      - The red laundry detergent
    And not like this:
      Groceries:
      - The red laundry
      detergent
    but like this:
      Groceries:
      - The red laundry
        detergent
    (Are my clothes real? Do I want to find out and take the red laundry detergent, or belay my suspicions and take the blue one?)
  • I would like to recognize and support enumerators.
    For example "1", "1.1", "a.", "a)", "(1)", etc.
    But probably not ones that use roman numerals.
    Note that enumerators either have their own line or prefix the first line of a paragraph.
    Enumerators impact wordwrapping in 2 ways:
    An enumerator is a paragraph delimiter.
    If an enumerator prefixes the first line of a paragraph, then it counts as whitespace regarding its paragraph's indentation.
  • Importing and exporting wrapped text.
    Text is wrapped per paragraph.
    Paragraph recognition differs between TSE and Word, for example.
    In TSE a paragraph is recognized by a blank line or by an indented or outdented first line. This is an existing setting.
    In Word a paragraph is recognized by "the enter key".
    If you copy a text from Word to TSE, then each Word paragraph becomes a single line.
    This implies that functions are also required to wrap or unwrap a buffer or block of text that was imported from, or will be exported to, another application.
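
The bullet point and enumerator notes above can be sketched in Python using the standard textwrap module. The prefix pattern is an assumption that covers only a few of the forms mentioned, and it would also mistake a sentence that starts with a bare number for an enumerator, so a real implementation would need more context. The function name is hypothetical.

```python
import re
import textwrap

# A bullet ("-", "*", "+") or an enumerator such as "1", "1.1", "a.", "(1)",
# followed by whitespace. Roman numerals are deliberately not recognized.
PREFIX = re.compile(r"\s*(?:[-*+]|\(\d+\)|\d+(?:\.\d+)*[.)]?|[a-z][.)])\s+")


def wrap_item(text, right_margin):
    """Wrap one bullet point or enumerated paragraph with a hanging indent."""
    match = PREFIX.match(text)
    prefix = match.group(0) if match else ""
    body = " ".join(text[len(prefix):].split())   # unwrap: collapse line breaks and spaces
    return textwrap.fill(
        body,
        width=right_margin,
        initial_indent=prefix,
        subsequent_indent=" " * len(prefix),      # the prefix counts as whitespace
    )
```

With a right margin of 19, wrap_item("- The red laundry detergent", 19) puts "detergent" on a second line that is indented by two spaces, matching the preferred example above. Joining the body onto one line before wrapping is also the "unwrap" step that exporting a paragraph to another application would need.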


These webpages are created and maintained with The SemWare Editor Professional