| Current hard space summary |
Better markup for the hard space, using ,,
1. Summary
The hard space is an important but neglected element of good Wikipedia editing. It prevents an unwanted line break, so it is also called the no-break space or non-breaking space. One example among very many: no line break should occur in "17 sq ft". At present there are two ways to achieve this: first, with the raw HTML code (17&nbsp;sq&nbsp;ft); second, with the {{nowrap}} template ({{nowrap|17 sq ft}}). These options are hard to remember, hard to input, and hard to interpret on the screen. Some cases are far more complex.
The solution? Introduce simple new Wikipedia markup, similar to the existing markup for italic (''italic text'') and bold ('''bold text'''). Although these are converted by the system into HTML code (<i>italic text</i> and <b>bold text</b> respectively), the text always appears in the edit box with the markup '' or '''.
The proposal simply adapts this useful and accepted idea to include the hard space. Extensive discussion among interested editors, followed by a poll, shows that ,, (two ordinary commas) is the best markup. Once it is implemented, one could type 17,,sq,,ft in the edit box, which would be converted internally to 17&nbsp;sq&nbsp;ft, so that the reader of the article always sees an unbroken "17 sq ft". In the edit box, we would still always see 17,,sq,,ft. This innovation is easy for experienced editors and welcoming to Wikipedia newcomers, since it follows the same style as the markup for bold and italics.
Analysis shows that comma-based markup could be extended for other formatting and punctuation; but that is beyond the present simple proposal.
|
| Current hard space technical details |
2. Technical details
The system's existing parsing of markup for italics ('') or bold (''') is a little complex. Those markups are "dual", requiring distinct interpretations as either beginnings (<i>, <b>) or ends (</i>, </b>). Those markups have to coexist with the use of ' as a single quote mark and as an apostrophe. Italics and bold are often applied together: sometimes overlapping, but more often nested like 'this' (markup: ''nested '''like 'this''''''). The WP community accepts the occasional ambiguity, and where the system fails to parse as an editor intends, there are workarounds available.
The proposed markup for the hard space will be much more straightforward – for originating editors, subsequent editors, and the system itself. There are no beginnings and ends, just single applications. Still, some slight complexities will arise, and they are easily dealt with. We start with the simplest case:
The case of ,,
- Exactly two adjacent commas will always be parsed as a hard space, yielding the HTML code &nbsp;. Inadvertent typing of ,, instead of , will cause no serious damage: its effect can easily be detected and repaired.
The case of ,,,
- The markup ,, must coexist with the use of , as an ordinary comma in the text, though it will rarely be adjacent to such a comma. If this ever does happen, the natural parsing of ,,, would be as comma + &nbsp;. This could conceivably be needed in a complex subscript, like this: W1, 2, 3, ... , n (markup: W<sub>1,,,2,,,3,,,...,,,n</sub>).
It is hard to think of a case in which the reverse would be needed (&nbsp; + comma); but if that ever did arise, <nowiki></nowiki>, &nbsp;, and {{nowrap}} would always be available, as they are now.
The case of ,,,, (and higher even numbers of commas)
- Sometimes more than one hard space is called for: in fine-tuning spacing in tabular work, for example. The natural parsing of ,,,, would be as &nbsp; + &nbsp;. More generally, any string of 2·n commas would be parsed as n instances of &nbsp;. If the editor intended comma + &nbsp; + comma, the existing alternative resources should be used.
The case of ,,,,, (and higher odd numbers of commas)
- The sequence ,,,,, would be most unlikely to occur, and would arise more often as an error than as deliberate markup. But the natural parsing would be comma + &nbsp; + &nbsp;. And similarly for longer odd-numbered strings of commas. If the editor intended &nbsp; + comma + &nbsp;, the existing alternative resources should be used.
In short, there is a single rational parsing for any comma-based markup that could arise; and in the rare cases in which no comma-based markup will yield the desired non-breaking HTML code, alternative resources will meet the need as they do now.
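The parsing rules above can be sketched in a few lines of Python. This is only an illustration of the desired behaviour, not the developers' implementation: the function name render_commas is hypothetical, and the sketch assumes that every run of commas is converted, ignoring regions such as <nowiki></nowiki> that a real parser would have to skip.

```python
import re

NBSP = "&nbsp;"  # HTML entity for the hard space


def render_commas(wikitext: str) -> str:
    """Convert runs of commas following the proposed rules (hypothetical helper).

    A run of 2*n commas becomes n hard spaces; a run of 2*n + 1 commas
    becomes one literal comma followed by n hard spaces. A single comma
    is therefore left untouched.
    """

    def expand(match: re.Match) -> str:
        count = len(match.group(0))
        literal = "," if count % 2 else ""  # odd runs keep one leading comma
        return literal + NBSP * (count // 2)

    return re.sub(r",+", expand, wikitext)


print(render_commas("17,,sq,,ft"))
print(render_commas("W<sub>1,,,2,,,3</sub>"))
```

Because each run of commas has exactly one reading under these rules, a single regular-expression pass is enough in this sketch; no look-ahead or pairing of opening and closing markers is needed, unlike the parsing of '' and '''.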
|
| Current hard space objections and replies |
3. Objections and replies
Objection 1
Hard spaces? I never use them! Why should I care?
- Good editing requires hard spaces, even if many editors know nothing about them. Wikipedia's Manual of Style (MOS) explains some of their uses; and more uses would be added there, if only the markup were simple enough.
Objection 2
The no-break space can already be input with &nbsp;. Recently this code has been attached to a button for insertion, under the edit box. Isn't that enough?
- There are three points to make.
- First, the code &nbsp; is hard to remember, hard to type, and hard to interpret on the screen, especially for those who are unfamiliar with HTML. One simple example: an en dash often needs a hard space before it. Consider "8 June – 13 July". To keep this from breaking improperly you need to type, or later edit, this code: 8&nbsp;June&nbsp;– 13&nbsp;July. Under our proposal, you would simply type this instead: 8,,June,,– 13,,July.
- Second, note that the goal of wikitext is to abstract HTML codes away from the user, especially for common markup such as ''italics'' and '''bold'''. If the hard space is to be used as often as MOS advocates, it also needs this treatment.
- Third, the new insert button does not help with interpreting the code seen in the edit box. It helps with inserting: but you still have to find the button, then find your place in the text again. Do you really want to do all that three times, for the example just given?
Objection 3
Why not use the Unicode character for the no-break space, attached to an insert button?
- There is still a problem with insert buttons as a solution. In contrast to &nbsp;, the Unicode non-breaking space is visually indistinguishable from the ordinary space. This is unacceptable, since the editor needs to be able to see the difference in the edit box.
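The point about visual indistinguishability can be checked with a short Python sketch. The variable names and the helper describe_spaces are illustrative assumptions; the only facts relied on are that the no-break space is the Unicode character U+00A0 and that Python's standard unicodedata module reports character names.

```python
import unicodedata

ordinary = "17 sq ft"            # ordinary spaces (U+0020)
hard = "17\u00a0sq\u00a0ft"      # no-break spaces (U+00A0)


def describe_spaces(text: str) -> list:
    """Return the Unicode name of every whitespace character in text."""
    return [unicodedata.name(ch) for ch in text if ch.isspace()]


# Printed side by side, the two strings typically look the same on screen...
print(ordinary)
print(hard)

# ...yet they are different strings, and only inspection reveals why.
print(ordinary == hard)
print(describe_spaces(ordinary))
print(describe_spaces(hard))
```

An editor looking at the edit box has no such inspection tool at hand, which is why visible markup is preferable to the raw character.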
Objection 4
There is also a {{nowrap}} template. Why not use that?
- This template may be useful in some longer phrases that shouldn't be wrapped at any point, like {{nowrap|5 sq mi}}; but again it is too visually intrusive to be used extensively for single no-break spaces. With our example above, the code would be {{nowrap|8 June }}– {{nowrap|13 July}}. And many cases are more complex than that.
Objection 5
I use a non-standard editor with aliasing, so when I want a hard space I type /h to get &nbsp;. Easy! So what's the problem?
- There are two problems. The code you make is still hard to read and edit; and not everyone can do what you can do.
Objection 6
I'm used to &nbsp;. That's what I'll always type!
- You could do that, of course. You'd still make markup text that's hard to read and hard to edit. Some people (very few!) type <i> and </i> in the edit box. But they'd make life easier for themselves and others if they just typed '' instead.
Objection 7
The proposal is technically too hard to implement, and is not standard anywhere.
- It is in fact easier to implement than '', ''', and other non-standard markup used at Wikipedia (see Technical details). Wikipedia is the leader in these matters. Our developers have the capacity to innovate, and others would almost certainly respect their precedent, and follow it.
Objection 8
The new markup is unintuitive, and unlike anything we have already.
- In fact it is quite intuitive, at least for editors who are already used to '' and '''. Beginners have to learn that markup: the new markup simply uses , instead of ': similar but distinct characters, both with their own stand-alone uses as apostrophe (or single quote mark) and comma, and both also used in markup.
Objection 9
Markup with commas is impractical, and a dead end in development.
- The comma markup has potential to be extended in all sorts of useful ways. Because a comma hardly ever occurs without a space or a digit after it, it is quite readily available for alternative use. We might in future consider markup like this: ,., (to force a break: equivalent to <br />); ,--, (for an en dash: equivalent to &ndash; or –); and so on. Such markup could be combined: ,,,--, (for an en dash with a hard space before it: equivalent to &nbsp;&ndash; or &nbsp;–). But any such extensions would be negotiable. Accepting the present proposal does not commit anyone to any such extensions.
Objection 10
It's all too hard! How can editors bring about a change like this?
- It is hard work to bring about such a change. But the rewards for this simple innovation are quite significant. And Wikipedia is made up of editors with good ideas that they pursue energetically. That's really what it's all about! We think that developers and decision-makers will take seriously a proposal that grows out of the concerns of competent and committed editors.
|
| Current hard space implementation |
4. Implementation
The coding and system changes involved in implementing new markup are a matter for developers. We have simply outlined the desired behaviour of the markup (see full specification in Technical details, above).
Once the new feature is in place, it would remain harmless and unnoticed by most editors, since they would rarely if ever input two adjacent commas. (This is a virtue of the markup we are promoting, in fact.) The community will therefore need to be informed of the change. This can be done at a variety of forums, including:
The details can be determined once the new markup has been accepted for incorporation. In the meantime, we should stress that nothing is lost through uncertainty and confusion, since the proposal adds a feature, but does not take anything away. In this respect it is safer than almost any alternative. For example, :: has meanings in mathematics (and : has meaning in current wiki markup); ;; may occur when HTML entities fall adjacent to a semicolon (and ; has meaning in wiki markup); `` has a meaning in some existing systems of wiki markup; etc.
By notifications as outlined above, and through general community interactions, the new ,, feature should soon become a part of accepted practice for all editors, just as '' and ''' are already accepted and appreciated.
|
|