Theory of Real Computation according to EGC∗

Chee Yap
Courant Institute of Mathematical Sciences
Department of Computer Science
New York University

April 17, 2007

Abstract

The Exact Geometric Computation (EGC) mode of computation has been developed over the last decade in response to the widespread problem of numerical non-robustness in geometric algorithms. Its technology has been encoded in libraries such as LEDA, CGAL and Core Library. The key feature of EGC is the necessity to decide zero in its computation. This paper addresses the problem of providing a foundation for the EGC mode of computation. This requires a theory of real computation that properly addresses the Zero Problem. The two current approaches to real computation are represented by the analytic school and the algebraic school. We propose a variant of the analytic approach based on real approximation.

• To capture the issues of representation, we begin with a reworking of van der Waerden's idea of explicit rings and fields. We introduce explicit sets and explicit algebraic structures.

• Explicit rings serve as the foundation for real approximation: our starting point here is not R, but F ⊆ R, an explicit ordered ring extension of Z that is dense in R. We develop the approximability of real functions within standard Turing machine computability, and show its connection to the analytic approach.

• Current discussions of real computation fail to address issues at the intersection of continuous and discrete computation. An appropriate computational model for this purpose is obtained by extending Schönhage's pointer machines to support both algebraic and numerical computation.

• Finally, we propose a synthesis wherein both the algebraic and the analytic models coexist to play complementary roles.
Many fundamental questions can now be posed in this setting, including transfer theorems connecting algebraic computability with approximability.

1 Introduction

Software breaks down due to numerical errors. We know that such breakdown has a numerical origin because when you tweak the input numbers, the problem goes away. Such breakdown may take the dramatic form of crashing or looping. But more insidiously, it may silently produce qualitatively wrong results. Such qualitative errors are costly to catch further down the data stream. The economic consequences of general software errors have been documented¹ in a US government study [32]. Such problems of geometric nonrobustness are also well-known to practitioners, and to users of geometric software. See [22] for the anatomy of such breakdowns in simple geometric algorithms.

In the last decade, an approach called Exact Geometric Computation (EGC) has been shown to be highly effective in eliminating nonrobustness in a large class of basic geometric problems. The fundamental analysis and prescription of EGC may be succinctly stated as follows:

"Geometry is concerned with relations among geometric objects. Basic geometric objects (e.g., points, half-spaces) are parametrized by numbers. Geometric algorithms (a) construct geometric objects and (b) determine geometric relations. In real geometry, these relations are determined by evaluating the signs of real functions, typically polynomials.

∗ Expansion of a talk by the same title at the Dagstuhl Seminar on "Reliable Implementation of Real Number Algorithms: Theory and Practice", Jan 7-11, 2006. This work is supported by NSF Grant No. 043086.
¹ A large part of the report focused on the aerospace and automobile industries. Both industries are major users of geometric software such as CAD modelers and simulation systems. The numerical errors in such software are well-known, and so we infer that part of the cost comes from the kind of error of interest to us.
Algorithms use these signs to branch into different computation paths. Each path corresponds to a particular output geometry. So the EGC prescription says that, in order to compute the correct geometry, it is sufficient to ensure that the correct path is taken. This reduces to error-free sign computations in algorithms."

How do algorithms determine the sign of a real quantity x? Typically, we compute approximations x̃ of increasing precision until the error |x − x̃| is known to be less than |x̃|; then we conclude sign(x) = sign(x̃). Note that this requires interval arithmetic, to bound the error |x − x̃|. But in case x = 0, interval arithmetic does not help: we may reduce the error as much as we like, but the stopping condition (i.e., |x − x̃| < |x̃|) will never hold. This goes to the heart of EGC computation: how to decide if x = 0 [42]. This zero problem has been extensively studied by Richardson [33, 34].

Numerical computation in this EGC mode² has far-reaching implications for software and algorithms. Currently, software such as LEDA [11, 27], CGAL [17] and the Core Library [21] supports this EGC mode. We note that EGC assumes that the numerical input is exact (see discussion in [42]).

In this paper, we address the problem of providing a computability theory for the EGC mode of computation. Clearly, EGC requires arbitrary precision computation and falls under the scope of the theory of real computation. While the theory of computation for discrete domains (natural numbers N or strings Σ*) has a widely accepted foundation, and possesses a highly developed complexity theory, the same cannot be said for computation over an uncountable and continuous domain such as R. Currently, there are two distinct approaches to the theory of real computation. We will call them the analytic school and the algebraic school respectively. The analytic school goes back to Turing (1936), Grzegorczyk (1955) and Lacombe (1955) (see [39]).
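The precision-increasing sign test described above is easy to sketch. The following is a minimal illustration (not from the paper): the hypothetical callback `approx(p)` stands in for interval arithmetic, returning some x̃ with |x − x̃| ≤ 2^{−p}, and the `max_p` cutoff is our own addition, since for x = 0 the loop would otherwise run forever.

```python
from fractions import Fraction

def approx_sign(approx, max_p=64):
    """Determine sign(x) from approximations x_tilde with |x - x_tilde| <= 2**-p.
    Stops once the error bound 2**-p is smaller than |x_tilde|; if x == 0
    this condition never holds, hence the max_p cutoff."""
    for p in range(1, max_p + 1):
        x_tilde = approx(p)                     # x_tilde = x ± 2**-p
        if Fraction(1, 2**p) < abs(x_tilde):
            return 1 if x_tilde > 0 else -1
    raise ValueError("cannot decide sign: x may be zero")

# Example: x = 1/3 - 1/4 > 0, approximated by truncating 1/3 to p dyadic bits
def x_approx(p):
    third = Fraction((2**p) // 3, 2**p)         # |1/3 - third| <= 2**-p
    return third - Fraction(1, 4)

print(approx_sign(x_approx))  # 1
```

The loop terminates for every nonzero x because eventually 2^{−p} drops below |x|/2 ≤ |x̃|; exactly as the text observes, no such guarantee exists when x = 0.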
Modern proponents of this theory include Weihrauch [39], Ko [23], Pour-El and Richards [31] and others. In fact, there are at least six equivalent versions of analytic computability, depending on one's preferred starting point (metric spaces, domain theory, etc.) [20, p. 330]. In addition, there are complementary logical or descriptive approaches, based on algebraic specifications or equations (e.g., [20]). But approaches based on Turing machines are most congenial to our interest in complexity theory. Here, real numbers are represented by rapidly converging Cauchy sequences. But an essential extension of Turing machines is needed to handle such inputs. In Weihrauch's TTE approach [39], Turing machines are allowed to compute forever to handle infinite input/output sequences. For the purposes of defining complexity, we prefer Ko's variant [23], using oracle Turing machines that compute in finite time. There is an important branch of the analytic school, sometimes known³ as the Russian Approach [39, Chap. 9]. Below, we will note the close connections between the Russian Approach (Kolmogorov, Uspenskiĭ, Mal'cev) and our work.

The algebraic school goes back to the study of algebraic complexity [7, 10]. The original computational model here is non-uniform (e.g., straight-line programs). The uniform version of this theory has been advocated as a theory of real computation by Blum, Shub and Smale [6]. Note that this uniform algebraic model is also the de facto computational model of theoretical computer science and algorithms. Here, the preferred model is the Real RAM [2], which is clearly equivalent to the BSS model. In the algebraic school, real numbers are directly represented as atomic objects. The model allows the primitive operations to be carried out in a single step, without error. We can also compare real numbers without error.
Although the BSS computational model [6] emphasizes the ring operations (+, −, ×), in the following we will allow our algebraic model to use any chosen set Ω of real operations.

We have noted that the zero problem is central to EGC: sign determination implies zero determination. Conversely, if we can determine zero, we can also determine sign under mild assumptions (though the complexity can be very different). A natural hierarchy of zero problems can be posed [42]. This hierarchy can be used to give a natural complexity classification of geometric problems. But these distinctions are lost in the analytic and algebraic approaches: the zero problem is undecidable in the analytic approach ([23, Theorem 2.5, p. 44]); it is trivial in the algebraic computational model. We need a theory where the complexity of [...]

² Likewise, one may speak of the "numerical analysis mode", the "computer algebra mode", or the "interval arithmetic mode" of computing. See [41].
³ We might say the main branch practices the Polish Approach.

[...] strings (Σ*), "explicit" refer to computing over some abstract domain (S). Further, we reserve the term "computability" for real numbers and real functions (see Section 4), in the sense used by the analytic school [39, 23].

Representation of sets. We now consider arbitrary sets S, T. Call ρ : T → S a representation of S (with representing set T) if ρ is an onto function. If ρ(t) = s, we call t a representation element of s. Relative to ρ, we say t, t′ are equivalent, denoted t ≡ t′, if ρ(t) ≡ ρ(t′) (recall that ≡ means equality in the strong sense). In case T = Σ* for some alphabet Σ, we call ρ a notation. It is often convenient to identify Σ* with N, under some bijection that takes each n ∈ N to a string in Σ*. Hence a representation of the form ρ : N → S is also called a notation. We generally use 'ν' instead of 'ρ' for notations. For computational purposes (according to Turing), we need notations.
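As a toy illustration of a notation (our own example, not from the paper), take binary numerals as a notation ν : {0,1}* → N. Distinct strings can name the same number (leading zeros), so distinct representation elements of one value arise naturally; `None` models the undefined value ↑.

```python
def nu(w):
    """A notation nu : {0,1}* -> N, reading w as a binary numeral.
    Returns None (i.e., the undefined value) on strings outside the domain."""
    return int(w, 2) if w and set(w) <= {'0', '1'} else None

print(nu('0101'), nu('101'))    # 5 5
print(nu('0101') == nu('101'))  # True: equivalent representation elements
print(nu('2'))                  # None: '2' is not in the domain of nu
```

Here the equivalence (w, w′) ∈ Eν iff ν(w) = ν(w′) is nontrivial but clearly decidable, which previews the notion of a recursive notation below.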
Our concept of notation ν and Mal'cev's numbering α are closely related: the fact that domain(α) ⊆ N while domain(ν) ⊆ Σ* is not consequential, since we may identify N with Σ*. But it is significant that ν is partial, while α is total. Unless otherwise noted, we may (wlog) assume Σ = {0, 1}. Note that if a set has a notation then it is countable.

A notation ν is recursive if the set⁸

   Eν := {(w, w′) ∈ (Σ*)^2 : ν(w) ≡ ν(w′)}

is recursive. In this case, we say S is ν-recursive. If S is ν-recursive then the set Dν := {w ∈ Σ* : ν(w) = ↓} (= domain(ν)) is recursive: to see this, note that w ∈ Dν iff (w, w↑) ∉ Eν, where w↑ is any word such that ν(w↑) = ↑. It is important to note that "recursiveness of ν" does not say that the function ν is a recursive function. Indeed, such a definition would not make sense unless S is a set of strings. The difficulty of defining explicit sets amounts to providing a substitute for defining "recursiveness of ν" as a function.

Tentatively, suppose we say S is "explicit" (in quotes) if there "exists" a recursive notation ν for S. Clearly, the "existence" here cannot have the standard understanding; otherwise we have the trivial consequence that a set S is "explicit" iff it is countable. One possibility is to understand it in some intuitionistic sense of "explicit existence" (e.g., [4, p. 5]). But we prefer to proceed classically. To illustrate some issues, consider the case where S ⊆ N. There are two natural notations for S: the canonical notation of S is νS : N → S, where νS(n) = n if n ∈ S, and otherwise νS(n) = ↑. The ordered notation of S is ν′S : N → S, where ν′S(n) = i if i ∈ S and the set {j ∈ S : j < i} has n elements. Let S be the halting set K ⊆ N, where n ∈ K iff the nth Turing machine on input string n halts. The canonical notation νK is not "explicit" since EνK is not recursive. But the ordered notation ν′K is "explicit" since Eν′K is the trivial diagonal set {(n, n) : n ∈ N}.
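The two notations can be prototyped for a decidable set, say the even numbers (an illustrative choice of ours; for the halting set K, `canonical` membership would be undecidable and `ordered` uncomputable, which is exactly the contrast the text draws).

```python
def canonical(member, n):
    """Canonical notation nu_S: nu_S(n) = n if n in S, else undefined (None)."""
    return n if member(n) else None

def ordered(member, n):
    """Ordered notation nu'_S: nu'_S(n) = the i in S such that
    {j in S : j < i} has exactly n elements (the n-th smallest element)."""
    count, i = 0, 0
    while True:
        if member(i):
            if count == n:
                return i
            count += 1
        i += 1

is_even = lambda k: k % 2 == 0
print([canonical(is_even, n) for n in range(5)])  # [0, None, 2, None, 4]
print([ordered(is_even, n) for n in range(5)])    # [0, 2, 4, 6, 8]
```

Note that `ordered` must search S, so it is only computable when membership in S is; its equivalence relation, however, is always the trivial diagonal, as observed above for ν′K.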
On the other hand, ν′K does not seem to be a "legitimate" way of specifying notations (for instance, it is not even a computable function). The problem we face is to distinguish between notations such as νK and ν′K. Our first task is to distinguish a legitimate set of notations. We consider three natural ways to construct notations: let νi : Σi* → Si (i = 1, 2) be notations, and let # be a symbol not in Σ1 ∪ Σ2.

1. (Cartesian product) The notation ν1 × ν2 for the set S1 × S2 is given by
   ν1 × ν2 : (Σ1* ∪ Σ2* ∪ {#})* → S1 × S2,
   where (ν1 × ν2)(w1#w2) = (ν1(w1), ν2(w2)) provided νi(wi) = ↓ (i = 1, 2); for all other w, we have (ν1 × ν2)(w) = ↑.

2. (Kleene star) The notation ν1* for finite strings over S1 is given by
   ν1* : (Σ1* ∪ {#})* → S1*,
   where ν1*(w1#w2# ··· #wn) = ν1(w1)ν1(w2) ··· ν1(wn) provided ν1(wj) = ↓ for all j; for all other w, we have ν1*(w) = ↑.

3. (Restriction) For an arbitrary function f : Σ1* → Σ2*, we obtain the notation
   ν2|f : Σ1* → T,
   where ν2|f(w) = ν2(f(w)) and T = range(ν2 ∘ f) ⊆ S2. Thus ν2|f is essentially the function composition ν2 ∘ f, except that the nominal range of ν2 ∘ f is S2 instead of T. If f is a recursive function, then we call this operation recursive restriction.

We now define "explicitness" by induction: a notation ν is explicit if
[Base Case] ν : Σ* → S is a 1-1 function and S is finite, or
[Induction] there exist explicit notations ν1, ν2 such that ν is one of the notations ν1 × ν2, ν1*, or ν1|f where f is recursive.
A set S is explicit if there exists an explicit notation for S. Informally, an explicit set is obtained by repeated application of Cartesian product, Kleene star and recursive restriction.

⁸ The alphabet for the set Eν may be taken to be Σ ∪ {#}, where # is a new symbol, and we write "(w, w′)" as a conventional rendering of the string w#w′.
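The three constructors can be modeled directly, treating a notation as a Python function that returns `None` for ↑. This is a simplified sketch of ours (it assumes `#` occurs in no sub-alphabet and elides the empty-string corner cases of the formal definition).

```python
def product(nu1, nu2):
    """nu1 x nu2: decode 'w1#w2' to (nu1(w1), nu2(w2)); None means undefined."""
    def nu(w):
        parts = w.split('#')
        if len(parts) != 2:
            return None
        a, b = nu1(parts[0]), nu2(parts[1])
        return None if a is None or b is None else (a, b)
    return nu

def star(nu1):
    """nu1*: decode 'w1#...#wn' to the tuple of decoded components."""
    def nu(w):
        if w == '':
            return ()
        vals = [nu1(wi) for wi in w.split('#')]
        return None if None in vals else tuple(vals)
    return nu

def restrict(nu2, f):
    """nu2|f: the composition nu2 o f for a (recursive) string function f."""
    return lambda w: nu2(f(w))

bit = lambda w: w if w in ('0', '1') else None   # base case: a finite set
binary = star(bit)
print(binary('1#0#1'))            # ('1', '0', '1')
print(product(bit, bit)('0#1'))   # ('0', '1')
print(binary('2'))                # None
```

Every explicit notation is then, by the inductive definition, some finite composition of these three combinators applied to a finite base case.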
Note that Cartesian product, Kleene star and restriction are analogues (respectively) of the Axiom of Pairing, the Axiom of Powers and the Axiom of Specification in standard set theory ([19, pp. 9, 6, 19]). Let us note some applications of recursive restriction. Suppose ν : Σ* → S is an explicit notation.

• (Change of alphabet) Notations can be based on any alphabet Γ: we can find a suitable recursive function f such that ν|f : Γ* → S is an explicit notation. We can further make ν|f a 1-1 function.

• (Identity) The identity function ν : Σ* → Σ* is explicit. To see this, begin with ν0 : Σ* → Σ, where ν0(a) = a if a ∈ Σ and ν0(w) = ↑ otherwise. Then ν can be obtained as a recursive restriction of ν0*. Thus, Σ* is an explicit set.

• (Subset) Let T be a subset of S such that D = {w : ν(w) ∈ T} is recursive. If ιD is the partial identity function of D, then ν|ιD is an explicit notation for T. We denote ν|ιD more simply by ν|T.

• (Quotient) Let ∼ be an equivalence relation on S; we want a notation for S/∼ (the set of equivalence classes of ∼). Consider the set E = {(w, w′) : ν(w) ∼ ν(w′) or ν(w) = ν(w′) = ↑}. We say ∼ is recursive relative to ν if E is a recursive set. Define the function f : Σ* → Σ* via f(w) = min{w′ : (w, w′) ∈ E} (where min is based on any lexicographic order ≤LEX on Σ*). If E is recursive then f is clearly a recursive function. Then the notation ν|f, which we denote by

   ν/∼,   (1)

can be viewed as a notation for S/∼, provided we identify S/∼ with a subset of S (namely, each equivalence class of S/∼ is identified with a representative from the class). This identification device will often be used below.

We introduce a normal form for explicit notations. Define a simple notation to be one obtained by applications of the Cartesian product and Kleene star operators to a base case (i.e., to a notation ν : Σ* → S that is 1-1 and S is finite). A simple set is one with a simple notation. In other words, simple sets do not need recursive restriction for their definition.
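The representative function f(w) = min{w′ : (w, w′) ∈ E} of the quotient construction can be sketched as a bounded search. In this illustration of ours we take ≤LEX to be the length-then-lexicographic order, which makes termination evident: the search never passes w itself, since (w, w) ∈ E.

```python
from itertools import count, product as iproduct

def lex_words(sigma):
    """All words over the alphabet sigma, in length-then-lex order."""
    yield ''
    for n in count(1):
        for tup in iproduct(sigma, repeat=n):
            yield ''.join(tup)

def quotient_rep(E, sigma):
    """nu/~: map w to the least w' with (w, w') in E.
    Terminates because (w, w) in E bounds the search by w itself."""
    def f(w):
        for w2 in lex_words(sigma):
            if E(w, w2):
                return w2
    return f

# Toy equivalence: two words are related iff they contain the same number of 1s
E = lambda u, v: u.count('1') == v.count('1')
f = quotient_rep(E, '01')
print(f('0110'))   # '11': the least word with two 1s
```

The chosen representative is the ≤LEX-least member of the class, matching the identification of S/∼ with a subset of S described above.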
A normal form notation νS for a set S is one obtained as the recursive restriction of a simple notation: νS = ν|f for some simple notation ν and recursive function f.

Lemma 1 (Normal form). If S is explicit, then it has a normal form notation νS.

Proof. Let ν0 : Σ* → S be an explicit notation for S. (0) If S is a finite set, then the result is trivial. (1) If ν0 = ν1 × ν2, then inductively assume the normal form notations νi = νi′|fi′ (i = 1, 2). Let ν = ν1′ × ν2′ and, for wi ∈ domain(νi′), define f by f(w1#w2) = f1′(w1)#f2′(w2). Clearly f is recursive and ν|f is an explicit notation for S. (2) If ν0 = ν1*, then inductively assume a normal form notation ν1 = ν1′|f1′. Let ν = (ν1′)* and, for all wj ∈ domain(ν1′), define f(w1#w2# ··· #wn) = f1′(w1)#f1′(w2)# ··· #f1′(wn). So ν|f is an explicit notation for S. (3) If ν0 = ν1|f1, then inductively assume the normal form notation ν1 = ν1′|f1′. Let ν = ν1′ and f = f1′ ∘ f1. Clearly, ν|f is an explicit notation for S. Q.E.D.

The following is easily shown using normal form notations:

Lemma 2. If ν is explicit, then the sets Eν and Dν are recursive.

In the special case where S ⊆ N or S ⊆ Σ*, we obtain:

Lemma 3. A subset S ⊆ N is explicit iff S is partial recursive (i.e., recursively enumerable).

Proof. If S is explicit, then any normal form notation ν : Σ* → S has the property that ν is a recursive function, and hence S is recursively enumerable. Conversely, if S is recursively enumerable, it is well-known that there is a total recursive function f : N → N whose range is S. We can use f as our explicit notation for S. Q.E.D.

We thus have the interesting conclusion that the halting set K is an explicit set, but not by virtue of the canonical (νK) or ordered (ν′K) notations discussed above. Moreover, the complement of K is not an explicit set, confirming that our concept of explicitness is non-trivial (i.e., not every countable set is explicit).
Our lemma suggests that explicit sets are analogues of recursively enumerable sets. We could similarly obtain analogues of recursive sets, and a complexity theory of explicit sets can be developed.

The following data structure will be useful for describing computational structures. Let L be any set of symbols. To motivate the definition, think of L as a set of labels for numerical expressions, e.g., L = Z ∪ {+, −, ×}, and we want to define arithmetic expressions labeled by L. An ordered L-digraph G = (V, E; λ) is a directed graph (V, E) with vertex set V = {1, ..., n} (for some n ∈ N) and edge set E ⊆ V × V, together with a labeling function λ : V → L, such that the set of outgoing edges from any vertex v ∈ V is totally ordered. Such a graph may be represented by a set {Lv : v ∈ V}, where each Lv is the adjacency list for v, having the form Lv = (v, λ(v); u1, ..., uk), where k is the outdegree of v and each (v, ui) is an edge. The total order on the set of outgoing edges from v is specified by this adjacency list. We deem two such graphs G = (V, E; λ) and G′ = (V′, E′; λ′) to be the same if, up to a renaming of the vertices, they have the same set of adjacency lists (so the identity of vertices is unimportant, but their labels are). Let DG(L) be the set of all ordered L-digraphs.

Lemma 4. Let S, T be explicit sets. Then the following sets are explicit: (i) the disjoint union S ⊎ T, (ii) the finite power set 2̂S (the set of finite subsets of S), (iii) the set of ordered S-digraphs DG(S).

Proof. Let x̲, x̲i and y̲ be representation elements for x, xi ∈ S and y ∈ T. We use standard encoding tricks. Assume # is a new symbol. (i) For the disjoint union, tag the representations of S by using #x̲ instead of x̲, while y̲ is unchanged. (ii) For the finite power set 2̂S, we apply recursive restriction to the notation ν* for S*: define f(x̲1# ··· #x̲n) = x̲π(1)# ··· #x̲π(m), where x̲π(1) < ··· < x̲π(m) (using any lexicographic ordering < on strings), assuming that {x1, ..., xn} = {xπ(1), ..., xπ(m)}.
Then ν*|f is a notation for 2̂S, assuming that we identify 2̂S with a suitable subset of S*. (iii) Recall the representation of an ordered S-digraph above, as a set of adjacency lists {Lv : v ∈ V}. The adjacency lists can be represented via the Kleene star operation, and for the set of adjacency lists we use the finite power set method of (ii). Vertices v ∈ N are represented by binary numbers. (Computationally, checking equivalence of representations for ordered S-digraphs is highly nontrivial, since the graph isomorphism problem is embedded here.) Q.E.D.

Convention for Representation Elements. In normal discourse, we prefer to focus on a set S rather than on its representing set T (under some representation ρ : T → S). We introduce an "underbar convention" to facilitate this. If x ∈ S, we shall write x̲ (x-underbar) to denote some representing element for x (so x̲ ∈ T is an "underlying representation" of x). This convention makes sense when the representation ρ is understood or fixed. Note that we have already used this convention in the above proof: writing "x̲" allows us to acknowledge that it is the representation of x in an unobtrusive way. The fact that "x̲" is under-specified (non-unique) is usually harmless.

Example: Dyadic numbers. Let D := Z[1/2] = {m·2^n : m, n ∈ Z} denote the set of dyadic numbers (or bigfloats, in programming contexts). Let Σ2 = {0, 1, •, +, −}. A string

   w = σ b_{n−1} b_{n−2} ··· b_k • b_{k−1} ··· b_0   (2)

[...]

We must discuss substructures. By a substructure of (S, Ω) we mean (S′, Ω′) such that S′ ⊆ S, there is a bijection between Ω′ and Ω, and each f′ ∈ Ω′ is the restriction to S′ of the corresponding operation or predicate f ∈ Ω. Thus, we may speak of subfields, subrings, etc. If (S, Ω) is ν-explicit, then (S′, Ω′) is a ν-explicit substructure of (S, Ω) if S′ is a ν-explicit subset of S and (S′, Ω′) is a substructure of (S, Ω). Thus, we have explicit subrings, explicit subfields, etc.
If (S′, Ω′) is an explicit substructure of (S, Ω), we call (S, Ω) an explicit extension of (S′, Ω′). The following shows the explicitness of some standard algebraic constructions:

Lemma 7. Let ν2 be the normal form notation for D = Z[1/2] in (3).
(i) D is a ν2-explicit ordered ring.
(ii) N ⊆ Z ⊆ D are ν2-explicit ordered subrings.
(iii) If D is an explicit domain, then the quotient field Q(D) is an explicit ring extension of D.
(iv) If R is an explicit ring, then the polynomial ring R[X] is an explicit ring extension of R.
(v) If F is an explicit field, then any simple algebraic extension F(θ) is an explicit field extension of F.

Proof. (i)-(ii) are obvious. The constructions (iii)-(v) are standard algebraic constructions; these constructions can be implemented using operations of explicit sets. Briefly: (iii) The standard representation of Q(D) uses D^2 as representing set. This is a direct generalization of the construction above, which gave an explicit notation ν for Q (= Q(Z)) starting from an explicit notation for Z. Now, we need to further verify that the field operations of Q(D) are ν-explicit. (iv) The standard representation of R[X] uses R* (Kleene star) as representing set: ρ : R* → R[X]. Since R is explicit, so is R*, and hence R[X]. All the polynomial ring operations are also recursive relative to this notation for R[X]. (v) Assume θ is the root of an irreducible polynomial p(X) ∈ F[X] of degree n. The elements of F(θ) can be directly represented by elements of F^n (the n-fold Cartesian product): thus, if ν is an explicit notation for F, then ν^n is an explicit notation of F^n = F(θ). The ring operations of F(θ) are reduced to polynomial operations modulo p(X), and division is reduced to computing inverses using the Euclidean algorithm. It is easy to check that these operations are ν^n-explicit. Q.E.D.

This lemma is essentially a restatement of corresponding results in [18]. It follows that standard algebraic structures such as Q[X1, ..., Xn] or algebraic number fields are explicit. Clearly, many more standard constructions can be shown explicit (e.g., explicit matrix rings).

The next lemma uses constructions whose explicitness is less obvious. Let IDn(F) denote the set of all ideals of F[X1, ..., Xn], where F is a field. For ideals I, J ∈ IDn(F), we have the ideal operations of sum I + J, product IJ, intersection I ∩ J, quotient I : J, and radical √I [40, p. 25]. These operations are all effective, for instance, using Gröbner basis algorithms [40, chap. 12].

Lemma 8. Let F be an explicit field. Then the set IDn(F) of ideals has an explicit notation ν, and the ideal operations of sum, product, intersection, quotient and radical are ν-explicit.

Proof. From Lemma 7(iv), F[X1, ..., Xn] is explicit. From Lemma 4(ii), the set S of finite subsets of F[X1, ..., Xn] is explicit. Consider the map ρ : S → IDn(F), where ρ({g1, ..., gm}) is the ideal generated by g1, ..., gm ∈ F[X1, ..., Xn]. By Hilbert's basis theorem [40, p. 302], ρ is an onto function, and hence a representation. If ν : Σ* → S is an explicit notation for S, then ρ ∘ ν is a notation for IDn(F). To show that this notation is explicit, it is enough to show that the equivalence relation Eρ is decidable (cf. (1)). This amounts to checking whether two finite sets {f1, ..., fℓ} and {g1, ..., gm} of polynomials generate the same ideal. This can be done by computing their Gröbner bases (since such operations are all rational and thus effective in an explicit field), and checking that each basis reduces the other set of polynomials to 0. Let S/Eρ denote the set of equivalence classes of S; by identifying S/Eρ with the set IDn(F), we obtain an explicit notation ν/Eρ : Σ* → S/Eρ for IDn(F). The (ν/Eρ)-explicitness of the various ideal operations now follows from known algorithms, using the notation ν/Eρ. Q.E.D.

Well-ordered sets. Many algebraic constructions (e.g., [38, chap. 10]) are transfinite constructions, e.g., the algebraic closure of fields. The usual approach for showing closure properties of such constructions depends on the well-ordering of sets (Zermelo's theorem), which in turn depends on the Axiom of Choice (e.g., [19] or [38, chap. 9]). Recall that a strict total ordering < of a set S is a well-ordering if every non-empty subset of S has a least element. In explicit set theory, we can replace such axioms by theorems, and replace non-constructive constructions by explicit ones.

Lemma 9. A ν-explicit set is well-ordered. This well-ordering is ν-explicit.

Proof. Let ν : Σ* → S be an explicit notation for S. Now Σ* is well-ordered by any lexicographic order ≤LEX on strings. This induces a well-ordering ≤ν on the elements x, y ∈ S as follows: let wx := min{w ∈ Σ* : ν(w) = x}. Define x ≤ν y if wx ≤LEX wy. The predicate ≤ν is clearly ν-explicit. Moreover, it is a well-ordering. Q.E.D.

The proof of Theorem 12 below depends on such a well-ordering.

Expressions. Expressions are basically "universal objects" in the representation of algebraic constructions. Let Ω̂ be a (possibly infinite) set of symbols for algebraic operations, and let k : Ω̂ → N assign an "arity" to each symbol in Ω̂. The pair (Ω̂, k) is also called a signature. Suppose Ω is a set of operations defined on a set S. To prove the closure of S under the operations in Ω, we consider "expressions" over Ω̂, where each ĝ ∈ Ω̂ is interpreted by a corresponding g ∈ Ω, and k(ĝ) is the arity of g. To construct the closure of S under Ω, we will use "expressions over Ω̂" as the representing set for this closure. Let Ω̂(k) denote the subset of Ω̂ comprising symbols with arity k. Recall the definition of the set DG(Ω̂) of ordered Ω̂-digraphs. An expression over Ω̂ is a digraph G ∈ DG(Ω̂) with the property that (i) the underlying graph is acyclic and has a unique source node (the root), and (ii) the outdegree of each node v is equal to the arity of its label λ(v) ∈ Ω̂.
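Conditions (i) and (ii) are both algorithmically checkable over the adjacency-list encoding. The following sketch of ours uses a dictionary `{node: (label, [children])}` in place of the formal adjacency lists; all names are hypothetical.

```python
def is_expression(adj, arity):
    """Check the two expression conditions over an ordered digraph.
    adj: {v: (label, [children...])}; arity: label -> required outdegree."""
    # (ii) the outdegree of each node equals the arity of its label
    for v, (lab, kids) in adj.items():
        if len(kids) != arity[lab]:
            return False
    # (i) acyclic, with a unique source node (the root)
    indeg = {v: 0 for v in adj}
    for _, (_, kids) in adj.items():
        for u in kids:
            indeg[u] += 1
    roots = [v for v, d in indeg.items() if d == 0]
    if len(roots) != 1:
        return False
    seen = set()
    def acyclic(v, path):
        if v in path:
            return False          # back edge: a cycle
        if v in seen:
            return True           # shared subexpression: fine in a DAG
        seen.add(v)
        return all(acyclic(u, path | {v}) for u in adj[v][1])
    return acyclic(roots[0], frozenset())

# The expression (2 + 3) * (2 + 3), sharing the subexpression 2 + 3 as a DAG
arity = {'+': 2, '*': 2, 2: 0, 3: 0}
adj = {1: ('*', [2, 2]), 2: ('+', [3, 4]), 3: (2, []), 4: (3, [])}
print(is_expression(adj, arity))  # True
```

The shared node in the example shows why expressions are DAGs rather than trees: repeated subexpressions are stored once, which matters for the evaluation function introduced later.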
Let Expr(Ω̂, k) (or simply Expr(Ω̂)) denote the set of expressions over Ω̂.

Lemma 10. Suppose Ω̂ is a ν-explicit set and the function k : Ω̂ → N is¹² ν-explicit. Then the set Expr(Ω̂) of expressions is an explicit subset of DG(Ω̂).

Proof. The set DG(Ω̂) is explicit by Lemma 4(iii). Given a digraph G = (V, E; λ) ∈ DG(Ω̂), it is easy to algorithmically check properties (i) and (ii) above in our definition of expressions. Q.E.D.

Universal Real Construction. A fundamental result of field theory is Steinitz's theorem on the existence and uniqueness of algebraic closures of a field F [38, chap. 10]. In standard proofs, we only need the well-ordering principle. To obtain the "explicit version" of Steinitz's theorem, it is clear that we also need F to be explicit. But van der Waerden pointed out that this may be insufficient: in general, we need another explicitness assumption, namely the ability to factor over F[X] (see [18]). Factorization in an explicit UFD (unique factorization domain) such as F[X] is equivalent to checking irreducibility [18, Theorem 4.2]. If F is a formally real field, then a real algebraic closure F̄ of F is an algebraic extension of F that is formally real, and such that no proper algebraic extension is formally real. Again F̄ exists [40, chap. 5], and is unique up to isomorphism. Our goal here is to give the explicit analogue of Steinitz's theorem for real algebraic closure.

If p, q ∈ F[X] are polynomials, then we consider the operations of computing their remainder p mod q, their quotient p quo q, their gcd GCD(p, q), their resultant resultant(p, q), the derivative dp/dX of p, the square-free part sqfree(p) of p, and the Sturm sequence Sturm(p) of p. Thus Sturm(p) is the sequence (p0, p1, ..., pk) where p0 = p, p1 = dp/dX, and p_{i+1} = p_{i−1} mod p_i (i = 1, ..., k), and p_{k+1} = 0. These are all explicit in an explicit field:

Lemma 11. If F is a ν-explicit field, and p(X), q(X) ∈ F[X], then the following operations are¹³ ν-explicit: p mod q, p quo q, GCD(p, q), dp/dX, resultant(p, q), sqfree(p), Sturm(p).

¹² Strictly speaking, k is (ν, ν′)-explicit where ν′ is the notation for N.
¹³ Technically, these operations are ν′-explicit where F[X] is a ν′-explicit set, and ν′ is derived from ν using the above standard operators.

Proof. Let prem(p, q) and pquo(p, q) denote the pseudo-remainder and pseudo-quotient of p(X), q(X) [40, Lemmas 3.5, 3.8]. Both are polynomials whose coefficients are determinants in the coefficients of p(X) and q(X). Hence prem(p, q) and pquo(p, q) are explicit operations. The leading coefficients of prem(p, q) and pquo(p, q) can be detected in an explicit field. Dividing out by the leading coefficient, we can obtain p mod q and p quo q from their pseudo-analogues. Similarly, GCD(p, q) and resultant(p, q) can be obtained via subresultant computations [40, p. 90ff]. Clearly, differentiation dp/dX is a ν-explicit operation. We can compute sqfree(p) as p/GCD(p, dp/dX). Finally, we can compute the Sturm sequence Sturm(p) because we can differentiate, compute p mod q, and test when a polynomial is zero. Q.E.D.

Some common predicates are easily derived from these operations, and they are therefore also explicit predicates: (a) p|q (p divides q) iff p mod q = 0. (b) p is squarefree iff sqfree(p) = p.

Let F be an ordered field. Given p ∈ F[X], an interval I is an isolating interval of p in one of the following two cases: (i) I = [a, a] and p(a) = 0 for some a ∈ F; (ii) I = (a, b) where a, b ∈ F, a < b, p(a)p(b) < 0, and the Sturm sequence of p evaluated at a has one more sign variation than the Sturm sequence of p evaluated at b. It is clear that an isolating interval uniquely identifies a root α in the real algebraic closure of F. Such an α is called a real root of p. In case p is square-free, we call the pair (p, I) an isolating interval representation for α.
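A minimal Sturm-sequence sketch over Q follows (our own illustration: it uses exact rational division instead of the pseudo-remainders of the proof, and the customary sign convention p_{i+1} = −(p_{i−1} mod p_i), under which the difference in sign variations counts real roots in an interval). Polynomials are coefficient lists, lowest degree first.

```python
from fractions import Fraction

def poly_eval(p, x):
    return sum(c * x**i for i, c in enumerate(p))

def deriv(p):
    return [i * c for i, c in enumerate(p)][1:]

def pmod(p, q):
    """Remainder of p divided by q, with exact rational arithmetic."""
    p = p[:]
    while len(p) >= len(q) and any(p):
        c = Fraction(p[-1], 1) / q[-1]
        d = len(p) - len(q)
        for i, qc in enumerate(q):
            p[i + d] -= c * qc
        while p and p[-1] == 0:
            p.pop()
    return p

def sturm(p):
    """Sturm sequence p0 = p, p1 = dp/dX, p_{i+1} = -(p_{i-1} mod p_i)."""
    seq = [p, deriv(p)]
    while any(seq[-1]):
        r = pmod(seq[-2], seq[-1])
        if not any(r):
            break
        seq.append([-c for c in r])
    return seq

def variations(seq, x):
    """Number of sign variations of the sequence evaluated at x (zeros dropped)."""
    signs = [poly_eval(q, x) for q in seq]
    signs = [s for s in signs if s != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a * b < 0)

# p = X^2 - 2: exactly one real root in (1, 2)
p = [Fraction(-2), Fraction(0), Fraction(1)]
s = sturm(p)
print(variations(s, 1) - variations(s, 2))  # 1
```

Checking condition (ii) for a candidate isolating interval (a, b) then amounts to evaluating `variations` at the two endpoints and testing p(a)p(b) < 0.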
We may now define the operation Rootk(a0, ..., an) (k ≥ 1, ai ∈ F) that extracts the kth largest real root of the polynomial p(X) = Σ_{i=0}^{n} ai X^i. This operation is undefined in case p(X) has fewer than k real roots. For the purposes of this paper, we shall define the real algebraic closure of F, denoted F̄, to be the smallest ordered field that is an algebraic extension of F and that is closed under the operation Rootk(a0, ..., an) for all a0, ..., an ∈ F and k ∈ N. For other characterizations of real algebraic closures, see, e.g., [40, Theorem 5.11].

Theorem 12. Let F be an explicit ordered field. Then the real algebraic closure F̄ of F is explicit. This field is unique up to F-isomorphism (isomorphism that leaves F fixed).

Unlike Steinitz's theorem [38, chap. 10], this result does not need the Axiom of Choice; and unlike the explicit version of Steinitz's theorem [18], it does not need factorization in F[X]. But the ordering in F must be explicit.

Proof. For simplicity in this proof, we will assume the existence of F̄ (see [40]). So our goal is to show its explicitness, i.e., we must exhibit an explicit notation ν for F̄, and show that the field operations as well as Rootk(a0, ..., an) are ν-explicit. Consider the set

   Ω̂ := F ∪ {+, −, ×, ÷} ∪ {Rootk : n ∈ N, 1 ≤ k ≤ n}

of operation symbols. The arities of these operations are defined as follows: the arity of x ∈ F is 0, the arity of g ∈ {+, −, ×, ÷} is 2, and the arity of Rootk is n + 1. It is easy to see that Ω̂ is explicit, and hence Expr(Ω̂) is explicit. Define a natural evaluation function,

   Eval : Expr(Ω̂) → F̄.   (4)

Let e be an expression. We assign val(u) ∈ F̄ to each node u of the underlying DAG of e, in bottom-up fashion; then Eval(e) is just the value of the root. If any node has an undefined value, then Eval(e) = ↑. The leaves are assigned constants val(u) ∈ F. At an internal node u labeled by a field operation (+, −, ×, ÷), we obtain val(u) as the result of the corresponding field operation on elements of F̄.
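The bottom-up assignment of val(u) for the field operations can be sketched as follows. This is a simplified model of ours (Rootk nodes are omitted, `None` models the undefined value ↑, and constants are exact rationals standing in for elements of F).

```python
from fractions import Fraction

def eval_expr(adj, v=None):
    """Bottom-up evaluation of an expression DAG over the field operations.
    adj: {node: (label, [children])}; labels are Fraction constants or one
    of '+', '-', '*', '/'.  None models the undefined value (e.g. x/0)."""
    if v is None:                      # locate the unique source node (root)
        targets = {u for (_, kids) in adj.values() for u in kids}
        v = next(n for n in adj if n not in targets)
    lab, kids = adj[v]
    if isinstance(lab, Fraction):      # leaf: a constant
        return lab
    vals = [eval_expr(adj, u) for u in kids]
    if None in vals:
        return None                    # an undefined child propagates upward
    a, b = vals
    if lab == '/':
        return None if b == 0 else a / b
    return {'+': a + b, '-': a - b, '*': a * b}[lab]

# e = (1/2 + 1/2) / (1 - 1): division by zero, so Eval(e) is undefined
F = Fraction
adj = {1: ('/', [2, 3]), 2: ('+', [4, 4]), 3: ('-', [5, 5]),
       4: (F(1, 2), []), 5: (F(1), [])}
print(eval_expr(adj))   # None
```

The example exhibits the propagation of undefinedness described next in the text: one division by zero makes the whole expression improper.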
Note that a division by 0 results in val(u) =↑. Similarly, if the label of u is Root_k and its children are u_0, . . . , u_n (in this order), then val(u) is equal to the kth largest real root (if defined) of the polynomial ∑_{j=0}^{n} val(u_j) X^j. We notice that the evaluation function (4) is onto, and hence is a representation of the real algebraic closure F̄. Since Expr(Ω̂) is an explicit set, Eval is a notation of F̄. It remains to show that Eval is an explicit notation.

To conclude that F̄ is an explicit set via the notation (4), we must be able to decide if two expressions e, e′ represent the same value. By forming the expression e − e′, this is reduced to deciding if the value of a proper expression e is 0. To do this, assume that for each expression e, we can either determine that it is improper or else we can compute an isolating interval representation (P_e(X), I_e) for its value val(e). To determine if val(e) = 0, we first compute the isolating interval representation (P_e, I_e). If P_e(X) = ∑_{i=0}^{n} a_i X^i, and if val(e) ≠ 0, then Cauchy's lower bound [40, Lem. 6.7] holds:

|val(e)| > |a_0| / (|a_0| + max{|a_i| : i = 1, . . . , n}).

Base Reals. We begin with a set of "base reals" that is suitable for approximating other real numbers. Using the theory of explicit algebraic structures, we can now give a succinct definition (cf. [41]): a subset F ⊆ R is called a ring of base reals if F is an explicit ordered ring extension of the integers Z, such that F is dense in R. Elements of F are called base reals. The rational numbers Q, or the dyadic (or bigfloat) numbers D = Z[1/2], or even the real algebraic numbers, can serve as the ring of base reals. Since F is an explicit ring, we can perform all the ring operations and decide if two base reals are equal. Being dense, we can use F to approximate any real number to any desired precision. We insist that all inputs and outputs of our algorithms are base reals.
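The dyadic choice D = Z[1/2] can be made concrete. Here is a minimal illustrative sketch (class and function names ours) of D as an explicit ordered ring — exact operations, decidable equality and order — that is dense in R:

```python
from fractions import Fraction

class Dyadic:
    """A dyadic number m * 2^e in lowest terms: a base real in D = Z[1/2]."""
    def __init__(self, m, e=0):
        while m != 0 and m % 2 == 0:     # normalize so m is odd (or zero)
            m //= 2
            e += 1
        if m == 0:
            e = 0
        self.m, self.e = m, e

    def to_fraction(self):
        return Fraction(self.m) * Fraction(2) ** self.e

    # D is an explicit ring: exact ring operations ...
    def __add__(self, o):
        e = min(self.e, o.e)
        return Dyadic(self.m * 2 ** (self.e - e) + o.m * 2 ** (o.e - e), e)

    def __mul__(self, o):
        return Dyadic(self.m * o.m, self.e + o.e)

    # ... with decidable equality and order.
    def __eq__(self, o):
        return (self.m, self.e) == (o.m, o.e)

    def __lt__(self, o):
        return self.to_fraction() < o.to_fraction()

def approx(x, p):
    """Density of D in R: a dyadic within 2^{-p} of the rational x."""
    return Dyadic(round(Fraction(x) * 2 ** p), -p)
```

Note that D is closed under x ↦ x/2 (shift e), which is the property assumed of F below.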
This approach reflects the actual world of computing very well: in computing systems, floating point numbers are often called "reals". The basic foundation for this form of real computation goes back to Brent [8, 9]. It is also clear that all practical development of real computation (e.g., [28, 30, 25]), as in our work in EGC, ultimately depends on approximations via base reals. The choice of D as the base reals is the simplest: assuming that F is closed under the map x ↦ x/2, we have D ⊆ F. In the following, we shall assume this property. Then we can do standard binary searches (divide by 2), work with dyadic notations, and all the results in [41] extend to our new setting.

Error Notation. We consider both absolute and relative errors: given x, x̃, p ∈ R, we say that x̃ is an absolute p-bit approximation of x if |x̃ − x| ≤ 2^{-p}. We say x̃ is a relative p-bit approximation of x if |x̃ − x| ≤ 2^{-p}|x|. The inequality |x̃ − x| ≤ 2^{-p} is equivalent to x̃ = x + θ2^{-p} where |θ| ≤ 1. To avoid introducing an explicit variable θ, we will write this in the suggestive form "x̃ = x ± 2^{-p}". More generally, whenever we use the symbol '±' in a numerical expression, it should be replaced by the sequence "+θ" where θ is a real variable satisfying |θ| ≤ 1. Like the big-Oh notation, we think of the ±-convention as a variable hiding device. As a further example, the expression "x(1 ± 2^{-p})" denotes a relative p-bit approximation of x. Also, write (x ± ε) and [x ± ε] (resp.) for the intervals (x − ε, x + ε) and [x − ε, x + ε].

Absolute and Relative Approximation. The ring F of base reals is used for approximation purposes. So all approximation concepts will depend on this choice. If f : S ⊆ R^n → R is¹⁵ a real function, we call a function

f̃ : (S ∩ F^n) × F → F (6)

an absolute approximation of f if for all d ∈ S ∩ F^n and p ∈ F, we have f̃(d, p) =↓ iff f(d) =↓. Furthermore, when f(d) =↓ then f̃(d, p) = f(d) ± 2^{-p}.
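Both error notions are mechanical to check. A small illustrative sketch (function names ours) over exact rationals:

```python
from fractions import Fraction

def is_abs_approx(xt, x, p):
    """x̃ = x ± 2^{-p}: x̃ is an absolute p-bit approximation of x."""
    return abs(xt - x) <= Fraction(1, 2 ** p)

def is_rel_approx(xt, x, p):
    """x̃ = x(1 ± 2^{-p}): x̃ is a relative p-bit approximation of x."""
    return abs(xt - x) <= Fraction(1, 2 ** p) * abs(x)
```

The relative notion is scale-invariant while the absolute one is not: for a tiny x, the value 0 can be an excellent absolute approximation yet a useless relative one.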
We can similarly define what it means for f̃ to be a relative approximation of f. Let A_f, R_f denote the set of all absolute, respectively relative, approximations of f. If f̃ ∈ A_f ∪ R_f, we also write "f̃(d)[p]" instead of f̃(d, p), to distinguish the precision parameter p. We remark that this parameter p could also be restricted to N for our purposes; we often use this below. We say f is absolutely approximable (or A-approximable) if some f̃ ∈ A_f is explicit. Likewise, f is partially absolutely approximable (or partially A-approximable) if some f̃ ∈ A_f is partially explicit. Analogous definitions hold for f being relatively approximable (or R-approximable) and partially relatively approximable (or partially R-approximable). The concept of approximability (in the four variants here) is the starting point of our approach to real computation. Notice that "real approximation" amounts to "explicit computation on the base reals".

Remark on nominal domains of partial functions. It may appear redundant to consider a function f that is a partial function and whose nominal domain S is a proper subset of R^n. In other words, by specifying S = R^n or S = domain(f), we can either avoid partial functions, or avoid S ≠ R^n. This attitude is implicit in recursive function theory, for instance. It is clear that the choice of S affects the computability of f, since S determines the input to be fed to our computing devices. In the next section, the generic function f(x) = √x is used to illustrate this fact. Intuitively, the definability of f at any point x is intrinsic to the function f,

¹⁵This is just a shorthand for "f : S → R and S ⊆ R^n". Similarly, f : S ⊆ R^n → T ⊆ R is shorthand for f : S → T with the indicated containments for S and T.
To see this, consider the fact that the choice of S is less flexible in algebra than in analysis. In algebra, we are not free to turn the division operation in a field into a total function, by defining it only over non-zero elements. In analysis, it is common to choose S so that f behave nicely: e.g., f has no singularity, f is convergent under Newton iteration, etc. But even here, this choice is often not the best and may hide some essential difficulties. So in general, we do not have the option of specifying S = Rn or S = domain(f) for a given problem. Much of what we say in this and the next section are echos of themes found in [23, 39]. Our two main goals are (i) to develop the computability of f in the setting of a general nominal domain S, and (ii) to expose the connection between computability of f with its approximability. A practical theory of real computability in our view should be largely about approximability. Regular Functions. Let f : S ⊆ R R. In [23] and [39], the real functions are usually restricted to S = [a, b], (a, b) or S = R; this choice is often is essential to the computability of f . To admit S which goes beyond the standard choices, we run into pathological examples such as S = R \ F. This example suggests that we need an ample supply of base reals in S. We say that a set S ⊆ R is regular if for all x ∈ S and n ∈ N, there exists y ∈ S ∩ F such that y = x ± 2−n. Thus, S contains base reals arbitrarily close to any member. We say f is regular if domain(f) is regular. Note that regularity, like all our approximability concepts, is defined relative to F. Cauchy Functions. The case n = 0 in (6) is rather special: in this case, f is regarded as a constant function, representing some real number x ∈ R. An absolute approximation of x is any function f̃ : F → F where f̃(p) = x± 2−p for all p ∈ F. We call f̃ a Cauchy function for x. The sequence (f̃(0), f̃(1), f̃(2), . . .) 
is sometimes called a rapidly converging Cauchy sequence for x; relative to f̃, the p-th Cauchy convergent of x is f̃(p). Extending the above notation, we may write A_x for the set of all Cauchy functions for x. But note that f̃ is not just an approximation of x: it uniquely identifies x. Thus f̃ is a representation of x. So by our underbar convention, we prefer to write "x̲" for any Cauchy function of x. Also write "x̲[p]" (instead of x̲(p)) for the pth convergent of x. We can also let R_x denote the set of relative approximations of x. If some x̲ ∈ A_x (x̲ ∈ R_x) is explicit, we say x is A-approximable (R-approximable). Below we show that x is R-approximable iff x is A-approximable. Hence we may simply speak of "approximable reals" without specifying whether we are concerned with absolute or relative errors.

Among the Cauchy functions in A_x, we identify one with nice monotonicity properties: every real number x can be written as n + 0.b_1 b_2 · · · where n ∈ Z and b_i ∈ {0, 1}. The b_i's are uniquely determined by x when x ∉ D. Otherwise, all b_i's are eventually 0 or eventually 1; for uniqueness, we require the b_i's to be eventually 0. Using this unique sequence, we define the standard Cauchy function of x via

β_x[p] = n + ∑_{i=1}^{p} b_i 2^{-i}.

For instance, −5/3 is written −2 + 0.01010101 · · ·. This defines the Cauchy function β_x[p] for all p ∈ N. Technically, we need to define β_x[p] for all p ∈ F: when p < 0, we simply let β_x[p] = β_x[0]; when p > 0 and is not an integer, we let β_x[p] = β_x[⌈p⌉]. We note some useful facts about this standard function:

Lemma 13. Let x ∈ R and p ∈ N.
(i) β_x[p] ≤ β_x[p + 1] ≤ x.
(ii) x − β_x[p] < 2^{-p}.
(iii) If y = β_x[p] ± 2^{-p}, then for all n ≤ p, we also have y = β_x[n] ± 2^{-n}. In particular, there exists y̲ ∈ A_y such that y̲[n] = β_x[n] for all n ≤ p.
To see (iii), it is sufficient to verify that if y = βx[p] ± 2−p then y = βx[p − 1] ± 21−p. Now βx[p] = βx[p− 1] + δ2−p where δ = 0 or 1. Hence y = βx[p]± 2−p = (βx[p− 1]± 2−p)± 2−p = βx[p− 1]± 21−p. Explicit computation with one real transcendental. In general, it is not known how to carry out explicit computations (e.g., decide zero) in transcendental extensions of Q (but see [12] for a recent positive result). However, consider the field F (α) where α is transcendental over F . If F is ordered, then the field F (α) can also be ordered using an ordering where a <′ α for all a ∈ F . Further, if F is explicit, then F (α) is also an explicit ordered field with this ordering <′. But the ordering <′ is clearly non-Archimedean (i.e., there are elements a, x ∈ F (α) such that for all n ∈ N, n|a| < |x|). Now suppose F ⊆ R and α ∈ R (for instance, F = Q and α = π). Then F (α) ⊆ R can be given the standard (Archimedean) ordering < of the reals. Theorem 14. If F ⊆ R is an explicit ordered field, and α ∈ R is an approximable real that is transcendental over F , then the field F (α) with the Archimedean order < is an explicit ordered field. Proof. The field F (α) is isomorphic to the quotient field of F [X], and by Lemma 7(iii,iv), this field is explicit. It remains to show that the Archimedean order < is explicit. Let P (α)/Q(α) ∈ F (α) where P (X), Q(X) ∈ F [X] and Q(X) 6= 0. It is enough to show that we can recognize the set of positive elements of F (α). Now P (α)/Q(α) > 0 iff P (α)Q(α) > 0. So it is enough to recognize whether P (α) > 0 for any P (α) ∈ F [α]. First, we can verify that P (α) 6= 0 (this is true iff some coefficient of P (α) is nonzero). Next, since α is approximable, we find increasingly better approximations α[p] ∈ F of α, and evaluate P (α[p]) for p = 0, 1, 2, . . .. To estimate the error, we derive from Taylor’s expansion the bound P (α) = P (α[p])±δp where δp = ∑ i≥1 2−ip|P (i)(α[p])|. 
We can easily compute an upper bound β_p ≥ |δ_p|, and stop when |P(α[p])| > β_p. Since δ_p → 0 as p → ∞, we can also ensure that β_p → 0. Hence termination is assured. Upon termination, we know that P(α) has the sign of P(α[p]). Q.E.D.

In particular, this implies that D(π) or Q(e) can serve as the set F of base reals. The choice D(π) may be appropriate in computations involving trigonometric functions, as it allows exact representation of the zeros of such functions, and thus the possibility to investigate the neighborhoods of such zeros computationally. Moreover, we can extend the above technique to any number of transcendentals, provided they are algebraically independent. For instance, π and Γ(1/3) = 2.678938 . . . are algebraically independent, and so D(π, Γ(1/3)) would be an explicit ordered field.

Real predicates. Given f : S ⊆ R → R, define the predicate Sign_f : S → {−1, 0, 1} given by

Sign_f(x) =
  0 if f(x) = 0,
  +1 if f(x) > 0,
  −1 if f(x) < 0,
  ↑ else.

Define the related predicate Zero_f : S → {0, 1} where Zero_f(x) ≡ |Sign_f(x)| (so range(Zero_f) ⊆ {0, 1}). By the fundamental analysis of EGC (see Introduction), Sign_f is the critical predicate for geometric algorithms. We usually prefer to focus on the simpler Zero_f predicate because the approximability of these two predicates is easily seen to be equivalent in our setting of base reals (cf. [41]). In general, a real predicate is a function P : S ⊆ R^n → R where range(P) is a finite set. The approximation of real predicates is somewhat simpler than that of general real functions. To treat the next result, we need some new definitions. Let S ⊆ D ⊆ Σ∗. We say S is recursive modulo D if there is a Turing machine that, on input x taken from the set D, halts in the state q↓ if x ∈ S, and in the state q↑ if x ∉ S. Similarly, S is partial recursive modulo D if there is a Turing machine that, on input x taken from D, halts iff x ∈ S. Let S ⊆ D ⊆ U where U is a ν-explicit set and ν : Σ∗ → U.
We say S is a (partially) ν-explicit subset of U modulo D if the set {w ∈ Σ∗ : ν(w) ∈ S} is (partial) recursive modulo {w ∈ Σ∗ : ν(w) ∈ D}. Also, denote by range_F(f) := {f(x) : x ∈ F ∩ S} the range of f when its domain is restricted to base real inputs.

Modulus of continuity. Consider a partial function f : S ⊆ R → R. We say f is continuous if for all x ∈ domain(f) and δ > 0, there is an ε = ε(x, δ) > 0 such that if y ∈ domain(f) and y = x ± ε then f(y) = f(x) ± δ. We say f is uniformly continuous if for all δ > 0, there is an ε = ε(δ) > 0 such that for all x, y ∈ domain(f), y = x ± ε implies f(y) = f(x) ± δ.

Let m : S × N → N. We call m a modulus function if for all x ∈ S, p ∈ N, we have m(x, p) =↑ iff m(x, 0) =↑. We define domain(m) := {x ∈ S : m(x, 0) =↓}. Such a function m is called a modulus of continuity (or simply, modulus function) for f if domain(f) = domain(m) and for all x, y ∈ domain(f), p ∈ N, if y = x ± 2^{-m(x,p)} then f(y) = f(x) ± 2^{-p}. Call a function m : N → N a uniform modulus of continuity (or simply, a uniform modulus function) for f if for all x, y ∈ domain(f), p ∈ N, if y = x ± 2^{-m(p)} then f(y) = f(x) ± 2^{-p}. To emphasize the distinction from the uniform version, we might describe the non-uniform modulus functions as "local modulus functions". The following is immediate:

Lemma 19. Let f : S ⊆ R → R. Then f is continuous iff it has a modulus of continuity; and f is uniformly continuous iff it has a uniform modulus of continuity.

Ko [23] uses uniform¹⁸ continuity to characterize computable total real functions of the form f : [a, b] → R. Our goal is to generalize his characterization to capture real functions with non-compact domains, as well as those that are partial. Our results will characterize computable real functions f : S → R where S ⊆ R^n is regular. The use of local continuity simplifies proofs, and avoids any appeal to the Heine–Borel theorem.

Multivalued modulus function of an OTM.
We now show how modulus functions can be computed, but they must be generalized to multivalued functions. Such functions are treated in Weihrauch [39]; in particular, computable modulus functions are multivalued [39, Cor. 6.2.8]. Let us begin with the usual (single-valued) modulus function m : S × N → N. It is computed by OTMs, since one of m's arguments is a real number. If (x̲, p) is the input to an OTM N which computes m, we still write "N^{x̲}[p]" for the computation of N on (x̲, p). Notice that, since the output of N comes from the discrete set N, we do not need an extra "precision parameter" to specify the precision of the output (unlike the general situation when the output is a real number).

Let M be an OTM that computes some function f : S ⊆ R → R. Consider the Cauchy function β⁺_x ∈ A_x where β⁺_x[n] = β_x[n + 1] for all n ∈ N. (Thus β⁺_x is a "sped up" form of the standard Cauchy function β_x.) Consider the function k_M : S × N → N where

k(x, p) = k_M(x, p) := the largest k such that the computation of M^{β⁺_x}[p] queries β⁺_x[k]. (8)

This definition depends on M using β⁺_x as oracle. If M^{β⁺_x}[p] =↑ then k(x, p) =↑; if M^{β⁺_x}[p] =↓ but the oracle was never queried, define k(x, p) = 0. Some variant of k(x, p) was used by Ko to serve as a modulus function for f. Unfortunately, we do not know how to compute k(x, p), as that seems to require simulating the oracle β⁺_x using an arbitrary oracle x̲ ∈ A_x. Instead, we proceed as follows. For any x̲ ∈ A_x, let x̲⁺ denote the oracle in A_x where x̲⁺[n] = x̲[n + 1] for all n. Define, in analogy to (8), the function k_M : A_S × N → N where A_S = ∪{A_x : x ∈ S} and

k(x̲, p) = k_M(x̲, p) := the largest k such that the computation of M^{x̲⁺}[p] queries x̲⁺[k]. (9)

As before, if M^{x̲⁺}[p] =↑ (resp., if the oracle was never queried), then k(x̲, p) =↑ (resp., k(x̲, p) = 0). The first argument of k is here an oracle, not a real number. For different oracles from the set A_x, we might get different results; such functions k are called intensional functions.¹⁹
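The quantity k_M(x̲, p) of (9) is just bookkeeping on oracle queries, which we can mimic with ordinary functions. In this toy sketch (all names ours), `machine` stands in for an OTM and `cauchy` for the oracle x̲; the wrapper answers query n with x̲[n + 1] (the sped-up oracle x̲⁺) and records the largest index queried:

```python
from fractions import Fraction

def track_queries(machine, cauchy, p):
    """Run machine on the sped-up oracle; return (output, largest index queried)."""
    largest = [0]
    def oracle(n):
        largest[0] = max(largest[0], n)
        return cauchy(n + 1)           # x+[n] = x[n+1]
    return machine(oracle, p), largest[0]

# A toy "machine" computing f(x) = 2x: one query at index p+1 suffices,
# since 2 * x[p+2] = 2x ± 2^{-(p+1)}.
double = lambda oracle, p: 2 * oracle(p + 1)

# A Cauchy-style oracle for x = 1/3 (rounded dyadic convergents).
third = lambda n: Fraction(round(Fraction(1, 3) * 2 ** n), 2 ** n)

result, k = track_queries(double, third, 5)
```

Here `k` plays the role of k_M(x̲, 5); feeding a different oracle for the same real 1/3 could yield a different `k`, which is exactly the intensionality discussed above.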
Computability of intensional functions is defined using OTMs, just as for real functions. We can naturally interpret k_M as representing a multivalued function (see [39, Section 1.4]), which may be denoted²⁰ k_M : S × N → 2^N with k_M(x, p) = {k_M(x̲, p) : x̲ ∈ A_x}. Statements about the intensional k_M can be suitably interpreted as statements about this multivalued function (see below). The following lemma is key:

¹⁸Indeed, uniform modulus functions are simply called "modulus functions" in Ko. He has a notion of "generalized modulus function" that is similar to our local modulus functions.
¹⁹Intensionality is viewed as the (possible) lack of "extensionality". We say k is extensional if for all a, b ∈ A_x and p ∈ N, k(a, 0) =↓ iff k(a, p) =↓; moreover, k(a, p) ≡ k(b, p). Extensional functions can be interpreted as (single-valued) partial functions on real arguments.
²⁰Or, k_M : S × N →→ N.

Lemma 20. Let f : S ⊆ R → R be computed by an OTM M, and p ∈ N.
(i) If x ∈ domain(f) and x̲ ∈ A_x, then (x ± 2^{-k_M(x̲,p)}) ∩ S ⊆ domain(f).
(ii) If, in addition, y = x ± 2^{-k_M(x̲,p)} and y ∈ S, then f(y) = f(x) ± 2^{1-p}.

Proof. Let y = x ± 2^{-k_M(x̲,p)} where x ∈ domain(f) and y ∈ S.
(i) We must show that y ∈ domain(f). For this, it suffices to show that M halts on the input (y̲′, p) for some y̲′ ∈ A_y. Consider the modified Cauchy function y̲′ given by

y̲′[n] = x̲⁺[n] if n + 1 ≤ k_M(x̲, p), and y̲′[n] = y̲[n] otherwise.

To see that y̲′ ∈ A_y, we only need to verify that y = y̲′[n] ± 2^{-n} for n + 1 ≤ k_M(x̲, p). This follows from

y = x ± 2^{-k_M(x̲,p)} = (x̲[n + 1] ± 2^{-n-1}) ± 2^{-k_M(x̲,p)} = x̲⁺[n] ± 2^{-n}.

Since the computation of M^{x̲⁺}[p] does not query x̲⁺[n] for n > k_M(x̲, p), it follows that this computation is indistinguishable from the computation of M^{y̲′}[p]. In particular, both M^{x̲⁺}[p] and M^{y̲′}[p] halt. Thus y ∈ domain(f).
(ii) We further show

|f(y) − f(x)| ≤ |f(y) − M^{y̲′}[p]| + |M^{x̲⁺}[p] − f(x)| ≤ 2^{-p} + 2^{-p} = 2^{1-p}.

Q.E.D.

Let M be an OTM. Define the intensional function m : A_S × N → N by m(x̲, p) := k_M(x̲, p + 1).
(10)

We call m an (intensional) modulus of continuity for f : S ⊆ R → R if²¹ for all x ∈ S and p ∈ N, we have m(x̲, p) =↓ iff x ∈ domain(f); in addition, for all x, y ∈ domain(f), if y = x ± 2^{-m(x̲,p)} then f(y) = f(x) ± 2^{-p}. The multivalued function m : S × N → 2^N corresponding to m will be called a multivalued modulus of continuity of f. In this case, we define the domain of both m and its multivalued counterpart to be domain(f). It is easy to see that f has a multivalued modulus of continuity iff it has a (single-valued) modulus of continuity. The next result may be compared to [39, Corollary 6.2.8].

Lemma 21. If f : S ⊆ R → R is computed by an OTM M, then the function m(x̲, p) of (10) is a modulus of continuity for f. Moreover, m is computable.

Proof. To show that m is a modulus of continuity for f, we may assume x ∈ S. Consider two cases: if x ∉ domain(f) then k(x̲, p + 1) is undefined. Hence m(x̲, p) is undefined, as expected. So assume x ∈ domain(f). Suppose y = x ± 2^{-m(x̲,p)} = x ± 2^{-k_M(x̲,p+1)} and y ∈ S. By the previous lemma, we know that y ∈ domain(f) and f(y) = f(x) ± 2^{-p}. Thus m is a modulus of continuity for f. To show that m is computable, we construct an OTM N which, on input x̲ ∈ A_x and p ∈ F, simulates the computation of M^{x̲⁺}[p + 1]. Whenever the machine M queries its oracle at some index n, the machine N queries x̲[n + 1] = x̲⁺[n] instead. When the simulation halts, N outputs the largest k such that x̲⁺[k] was queried (or k = 0 if there were no oracle queries). Q.E.D.

The above proof shows that m(x̲, p) := k_M(x̲, p + 1) is a modulus of continuity in the following "strong" sense: a multivalued modulus function m for f : S → R is said to be strong if x ∈ domain(f) implies [x ± 2^{-m(x̲,p)}] ∩ S ⊆ domain(f). Note that if S is regular, then domain(m) is regular. Thus:

Corollary 22. If f : S ⊆ R → R is computable, then it has a strong multivalued modulus function that is computable. In particular, f is continuous.
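To make the modulus concept concrete, here is a hand-computed local modulus of continuity for f(x) = 1/x on S = (0, ∞). This is an illustrative sketch of ours, not from the paper; the bound includes a safety bit against float rounding in the logarithm:

```python
from fractions import Fraction
import math

def modulus_recip(x, p):
    """m(x, p) for f(x) = 1/x on (0, inf):
    if y = x ± 2^{-m(x,p)} then 1/y = 1/x ± 2^{-p}."""
    # pick t with 2^{-t} <= x (one extra bit guards against float rounding)
    t = max(0, math.ceil(-math.log2(float(x)))) + 1
    # this m ensures 2^{-m} <= x/2 and 2^{1-m}/x^2 <= 2^{-p}
    return p + 2 + 2 * t

m = modulus_recip(Fraction(1, 3), 8)   # here m == 16
```

The key point of the definition is visible here: the modulus is local — it degrades (m grows) as x approaches the singularity at 0, so no uniform modulus exists on all of (0, ∞).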
²¹In discussing intensional functions, it is convenient to assume that whenever we introduce a quantified real variable x, we simultaneously introduce a corresponding universally-quantified Cauchy function variable x̲ ∈ A_x. These two variables are connected by our underbar convention. That is, "(Qx ∈ S ⊆ R)" should be translated "(Qx ∈ S ⊆ R)(∀x̲ ∈ A_x)" where Q ∈ {∀, ∃}.

Modulus cover. An alternative formulation of strong modulus of continuity is this: let 𝔉 := {(a, b) : a < b, a, b ∈ F} denote the set of open intervals with endpoints in F. A modulus cover refers to any subset G ⊆ 𝔉 × N. For simplicity, the typical element in G is written (a, b, p) instead of the more correct ((a, b), p). We call G a modulus cover of continuity (or simply, a modulus cover) for f : S ⊆ R → R if the following two conditions hold:
(a) For each p ∈ N and x ∈ domain(f), there exists (a, b, p) ∈ G with x ∈ (a, b).
(b) For all (a, b, p) ∈ G, we have (a, b) ∩ S ⊆ domain(f). Moreover, x, y ∈ (a, b) ∩ S implies f(x) = f(y) ± 2^{-p}.
If the characteristic function χ_G : 𝔉 × N → {1} of G is (resp., partially) explicit, then we say G is (resp., partially) explicit. The advantage of using G over a modulus function m is that we avoid multivalued functions, and the triples of G are parametrized by base reals. Thus we can compute the characteristic function of G using ordinary Turing machines, while m must be computed by OTMs. We next show that we can interchange the roles of G and m.

Lemma 23. For f : S ⊆ R → R, the following statements are equivalent:
(i) f has a modulus cover G that is partially explicit.
(ii) f has a strong multivalued modulus function m that is computable.

Proof. (i) implies (ii): If G is available, we can define m(x̲, p) via the following dovetailing process: let the input be a Cauchy function x̲. For each (a, b, p) ∈ 𝔉 × N, we initiate a (dovetailed) computation to do three steps:
(1) Check that (a, b, p) ∈ G.
(2) Find the first i = 0, 1, . . . such that [x̲[i] ± 2^{-i}] ⊆ (a, b).
(3) Output k = −⌊log₂ min{x̲[i] − 2^{-i} − a, b − 2^{-i} − x̲[i]}⌋.
Correctness of this procedure: since G is partially explicit, step (1) will halt if (a, b, p) ∈ G. Step (2) amounts to checking the predicate a < x < b. If x ∈ domain(f) then steps (1) and (2) will halt for some (a, b, p) ∈ G. The output k in step (3) has the property that if y = x ± 2^{-k} and y ∈ S, then y ∈ domain(f) and f(y) = f(x) ± 2^{-p}. Thus m is a strong modulus of continuity of f, and our procedure shows m to be computable.

(ii) implies²² (i): Suppose f has a modulus function m that is computed by the OTM M. A finite sequence σ = (x_0, x_1, . . . , x_k) is called a Cauchy prefix if there exists a Cauchy function x̲ such that x̲[i] = x_i for i = 0, . . . , k. We say x̲ extends σ in this case. Call σ a witness for a triple (a, b, p) ∈ F × F × N provided the following conditions hold:
(4) [a, b] ⊆ ∩_{i=0}^{k} [x_i ± 2^{-i}].
(5) If x̲ extends σ, then the computation M^{x̲}[p] halts and does not query the oracle for x̲[n] for any n > k.
(6) If M^{x̲}[p] outputs ℓ = m(x̲, p), then we have 0 < b − a < 2^{-ℓ}.
Let G comprise all (a, b, p) that have a witness. The set G is partially explicit since, on input (a, b, p), we can dovetail through all sequences σ = (x_0, . . . , x_k), checking if σ is a witness for (a, b, p). This amounts to checking conditions (4)–(6). To see that G is a modulus cover for f, we first note that if (a, b, p) ∈ G then (a, b) ∩ S ⊆ domain(M) = domain(f). Moreover, for all p ∈ N and x ∈ domain(f), we claim that there is some (a, b) ∈ 𝔉 where (a, b, p) ∈ G and x ∈ (a, b). To see this, consider the computation of M^{β_x}[p] where β_x is the standard Cauchy function of x. If the largest query made by this computation to the oracle β_x is k, then consider the sequence σ = (β_x[0], . . . , β_x[k]). Note that x is in the interior of [β_x[k] ± 2^{-k}] = ∩_{i=0}^{k} [β_x[i] ± 2^{-i}]. If M^{β_x}[p] = ℓ, then we can choose (a, b) ⊆ (β_x[k] ± 2^{-k}) such that b − a ≤ 2^{-ℓ} and a < x < b.
Also, for all y, y′ ∈ (a, b), we have y′ = y ± 2^{-m(y̲,p)} and hence |f(y′) − f(y)| ≤ 2^{-p}. Q.E.D.

Corollary 24. If the function f : S → R is computable, then it has a partially explicit modulus cover G. Moreover, if S is regular then f is regular.

²²This proof is kindly provided by V. Bosserhoff and another referee. My original argument required S to be regular.

analysts proceed in this manner. Indeed, most numerical analysis books take STEP A only, and rarely discuss the issues in taking STEP B. From a purely practical viewpoint, the algebraic model provides a useful level of abstraction to guide the eventual transfer of algorithmic ideas into actual executable programs. We thus see that, on the one hand, the algebraic model is widely used, and on the other hand, severe criticisms arise when it is proposed as the computational model of numerical analysis (and indeed of all scientific computation). This tension is resolved through our scheme, where the algebraic model takes its proper place.

Pointer Machines. We now face the problem of constructing a computational framework in which the algebraic and numerical worlds can co-exist and complement each other. We wish to ensure from the outset that both discrete combinatorial computation and continuous numerical computation can be naturally expressed in this framework. Following Knuth, we may describe such computation as "semi-numerical". Current theories of computation (algebraic or analytic or standard complexity theories) do not adequately address such problems. For instance, real computation is usually studied as the problem of computing a real function f : R → R, even though this is a very special case with no elements of combinatorial computing. In Sections 4 and 5 we followed this tradition. On the other hand, the algorithms of computational geometry are invariably semi-numerical [42]. Following [41], we will extend Schönhage's elegant Pointer Machine Model [36] to provide such a framework.
We briefly recall the concept of a pointer machine (or storage modification machine). Let ∆ be any set of symbols, which we call tags. Pointer machines manipulate graphs whose edges are labeled by tags. More precisely, a tagged graph (or ∆-graph) is a finite directed graph G = (V, E) with a distinguished node s ∈ V called the origin, and a label function that assigns tags to edges such that the outgoing edges from any node have distinct tags. We can concisely write G = (V, s, τ) where s ∈ V and τ : V × ∆ → V is the tag function. Note that τ is a partial function, and it implicitly defines E, the edge set: (u, v) ∈ E iff τ(u, a) = v for some a ∈ ∆. Write u →_a v if τ(u, a) = v. The edge (u, v) is also called a pointer, written u → v, and its tag is a. Each word w ∈ ∆∗ defines at most one path in G that starts at the origin and follows a sequence of edges whose tags appear in the order specified by w. Let [w]_G (or [w], if G is understood) denote the last node in this path. If no such path exists, then [w] =↑. It is also useful to let w⁻ denote the word with the last tag a of w removed: so w⁻a = w. In case w = ε (the empty word), we let w⁻ denote ε. Thus, if w ≠ ε then the last edge in the path is [w⁻] → [w]. A node u is accessible if there is a w such that [w] = u; otherwise, it is inaccessible. If we prune all inaccessible nodes and the edges issuing from them, we get a reduced tagged graph. We distinguish tagged graphs only up to equivalence, defined as isomorphism on reduced tagged graphs. For any ∆-graph G, let G|w denote the ∆-graph that is identical to G except that the origin of G is replaced by [w]_G. Let G_∆ denote the set of all ∆-graphs. Pointer machines manipulate ∆-graphs. Thus, the ∆-graphs play the role of strings in Turing machines, and G_∆ is the analogue of Σ∗ as the universal representation set.
The key operation²⁴ of pointer machines is the pointer assignment instruction: if w, w′ ∈ ∆∗, then the assignment w ← w′ modifies the current ∆-graph G by redirecting or creating a single pointer. This operation is defined iff [w⁻] and [w′] are both defined. There are two possibilities: if w = ε, then the assignment amounts to changing the origin to [w′]. Else, if w = w⁻a, then the pointer from [w⁻] with tag a is redirected to point to [w′]; if there was no previous pointer with tag a, then this operation creates the new pointer [w⁻] →_a [w′]. If G′ denotes the ∆-graph after the assignment, we usually have the effect that [w]_{G′} = [w′]_G. But this equation fails in general: e.g., let w = abaa and w′ = ab where [ε], [a], [ab] are three distinct nodes but [aba] = [ε] = [abb]. Assignment, plus three other instructions of the pointer machines, are summarized in rows (i)–(iv) of Table 1. A pointer machine, then, is a finite sequence of these four types of pointer instructions, possibly with labels. With suitable conventions for input, output, halting in state q↑ or q↓, etc., which the reader may readily supply (see [41] for details), we now see that each pointer machine computes a partial function f : G_∆ → G_∆. It is easy to see that pointer machines can simulate Turing machines. The converse simulation is also possible. The merit of pointer machines lies in their naturalness in modeling combinatorial computing; in particular, they can directly represent graphs, in contrast to Turing machines, which must "linearize" graphs into strings.

²⁴The description here is a generalization of the one in [41], which also made the egregious error of describing the result of w ← w′ by the equation [w]_{G′} = [w′]_G.
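The assignment semantics, including the abaa counterexample just mentioned, can be played out concretely. A minimal Python sketch (class and method names are ours):

```python
class TaggedGraph:
    """A ∆-graph G = (V, s, τ) with the pointer-assignment instruction
    w ← w′ of Table 1.  Minimal sketch, not the paper's formal model."""
    def __init__(self):
        self.n = 0
        self.origin = self.fresh()
        self.tau = {}                       # (node, tag) -> node

    def fresh(self):
        self.n += 1
        return self.n

    def node(self, w):
        """[w]: follow the tags of w starting from the origin."""
        u = self.origin
        for a in w:
            u = self.tau.get((u, a))
            if u is None:
                return None                 # [w] is undefined
        return u

    def assign(self, w, wp):
        """The instruction w ← w′."""
        target = self.node(wp)              # [w′], resolved before the change
        if w == "":
            self.origin = target            # change the origin
        else:
            head, a = self.node(w[:-1]), w[-1]
            self.tau[(head, a)] = target    # redirect/create pointer with tag a

# The counterexample from the text: [ε], [a], [ab] distinct, [aba] = [ε] = [abb].
G = TaggedGraph()
s = G.origin
u, v = G.fresh(), G.fresh()
G.tau.update({(s, "a"): u, (u, "b"): v, (v, "a"): s, (v, "b"): s})
before = G.node("ab")                       # [w′]_G
G.assign("abaa", "ab")
after = G.node("abaa")                      # [w]_{G′}
# after != before: the naive equation [w]_{G′} = [w′]_G fails on this graph.
```

Tracing `node("abaa")` after the assignment shows why: redirecting the tag-a pointer out of [aba] = [ε] changes the very path that w = abaa denotes.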
Type | Name | Instruction | Effect (G is transformed to G′)
(i) | Pointer Assignment | w ← w′ | [w⁻]_G →_a [w′]_G holds in G′, where w⁻a = w
(ii) | Node Creation | w ← new | [w⁻]_G →_a u holds in G′, where u is a new node
(iii) | Node Comparison | if w ≡ w′ goto L | G′ = G
(iv) | Halt and Output | HALT(w) | Output G′ = G|w
(v) | Value Comparison | if (w ⋄ w′) goto L | Branch if Val_G(w) ⋄ Val_G(w′), where ⋄ ∈ {=, <, ≤}
(vi) | Value Assignment | w := o(w_1, . . . , w_m) | Val_{G′}(w) = o(Val_G(w_1), . . . , Val_G(w_m)), where o ∈ Ω and w, w_i ∈ ∆∗

Table 1: Instruction Set of Pointer Models

Semi-numerical Problems and Real Pointer Machines. Real RAMs and BSS machines have the advantage of being natural for numerical and algebraic computation. We propose to marry these features with the combinatorial elegance of pointer machines. We extend tagged graphs to support real computation by associating a real value val(u) ∈ R with each node u ∈ V. Thus a real ∆-graph is given by G = (V, s, τ, val). Let G_∆(R) (or simply G(R)) denote the set of such real ∆-graphs. For a real pointer machine to manipulate such graphs, we augment the instruction set with two instructions, as specified by rows (v)–(vi) in Table 1. Instruction (v) compares the values of two nodes and branches accordingly; instruction (vi) applies an algebraic operation o to the values specified by nodes. The algebraic operations o come from a set Ω, which we call the computational basis of the model. The simplest computational basis is Ω_0 = {+, −, ×} ∪ Z, resulting in nodes with only integer values. Each real pointer machine computes a partial function

f : G(R) → G(R) (11)

For simplicity, we define²⁵ a semi-numerical problem to also be a partial function of the form (11). The objects of computational geometry can be represented by real tagged graphs (see [42]). Thus, the problems of computational geometry can be regarded as semi-numerical problems.
A semi-numerical problem (11) is Ω-solvable if there is a halting real pointer machine over the basis Ω that solves it. Another example of a semi-numerical problem is the evaluation function EvalΩ : Expr(Ω) ⇀ R (cf. (4)), where the set Expr(Ω) of expressions is directly represented by tagged graphs. Real pointer machines constitute our idealized algebraic model for STEP A in our 2-stage scheme. Since real pointer machines are equivalent in power to real RAMs or BSS machines, the true merit of real pointer machines lies in their naturalness for capturing semi-numerical problems. For STEP B, the Turing model is adequate26 but not natural. For instance, numerical analysts do not think of their algorithms as pushing bits on a tape, but as manipulating higher-level objects such as numbers or matrices with appropriate representations. To provide a model closer to this view, we introduce numerical ∆-graphs, which are similar to real ∆-graphs except that the value at each node is a base real from F. The instructions for modifying numerical tagged graphs are specified by rows (i)-(v) of Table 1, plus a modified row (vi). The modification is that each o ∈ Ω is replaced by a relative approximation õ which takes an extra precision argument (a value in F). So a numerical pointer machine N is defined by a sequence of these instructions; we assume a fixed convention for specifying a precision parameter p for such machines. N computes a partial function f̃ : G(F) × F ⇀ G(F). Let X = A (absolute) or R (relative). We say that f̃ is an X-approximation of f if, for all G ∈ G(F) and p ∈ F, the graph f̃(G, p) (if defined) is a p-bit X-approximation of f(G) in this sense: their underlying reduced graphs are isomorphic, and each numerical value in f̃(G, p) is a p-bit X-approximation of the corresponding real value in f(G). We say f is X-approximable if there is a halting numerical pointer machine that computes an X-approximation of f . This is the EGC notion of approximation.
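To make "p-bit relative approximation" concrete, here is one possible õ for the square root over the dyadic base reals (the function name and scaling scheme are ours; Python's exact Fraction type stands in for F). It guarantees |s − √x| ≤ 2^(−p) · √x:

```python
import math
from fractions import Fraction

def approx_sqrt(x: Fraction, p: int) -> Fraction:
    """A p-bit relative approximation of sqrt(x) for x >= 0: the result s
    is a dyadic rational with |s - sqrt(x)| <= 2**(-p) * sqrt(x)."""
    assert x >= 0
    if x == 0:
        return Fraction(0)
    # choose an even shift k so that n = floor(x * 2^k) has at least
    # 2p+4 bits; then isqrt(n) carries p+2 significant bits, enough for
    # relative error below 2^(-p) after rescaling by 2^(k/2)
    k = 0
    while (x.numerator << k) // x.denominator < 1 << (2 * p + 4):
        k += 2
    n = (x.numerator << k) // x.denominator
    return Fraction(math.isqrt(n), 1 << (k // 2))
```

Note how the precision argument p travels with the operation, exactly as in the modified row (vi). An absolute (A-) approximation would instead bound |s − √x| ≤ 2^(−p); the scaling loop above is what makes the error relative, so the guarantee holds even for tiny inputs.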
25 This is analogous to defining problems in discrete complexity to be partial functions f : Σ∗ ⇀ Σ∗. In this way we side-step the issues of representation.
26 We might also say that recursive functions are an adequate basis for semi-numerical problems. But this is even less natural.

Transfer Theorems. Let SNΩ denote the class of semi-numerical problems that are Ω-solvable by real pointer machines. For instance, EvalΩ ∈ SNΩ. Similarly, S̃NΩ is the class of semi-numerical problems that can be R-approximated by numerical pointer machines. (Note that we use relative approximation here.) What is the relationship between these two classes? We reformulate a basic result from [41, Theorem 23]:

Proposition 27. Let Ω be any set of real operators. Then SNΩ ⊆ S̃NΩ iff EvalΩ ∈ S̃NΩ.

This can be viewed as a completeness result about EvalΩ, or as a transfer theorem that tells us when the transition from STEP A to STEP B in our 2-stage scheme is guaranteed to succeed. In numerical computation, there is a "transfer process" that is widely used: suppose M is a real pointer machine that Ω-solves some semi-numerical problem f : G(R) ⇀ G(R). Then we can define a numerical pointer machine M̃ that computes f̃ : G(F) × F ⇀ G(F), where M̃ simply replaces each algebraic operation o(w1, . . . , wm) by its approximate counterpart õ(w1, . . . , wm, p), where p specifies the precision argument for f̃(G, p). In fact, it is often assumed in numerical analysis that STEP B consists of applying this transformation to the ideal algorithm M from STEP A. We can now formulate a basic question: under what conditions does lim_{p→∞} f̃(G, p) = f(G)? The framework of this section makes it clear that our investigation of real approximation is predicated upon two choices: the base reals F, and the computational basis Ω. Therefore, the set of "inaccessible reals" is not a fixed concept, but is relative to these choices.
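The transfer process and the limit question can be illustrated with a toy straight-line program (all names ours): the ideal machine computes f(x) = x³ − 2x exactly, and its transferred version follows each exact operation with a p-bit truncation fl(·, p), a crude absolute approximation. Since the program has no comparisons, the error is bounded by a small multiple of 2^(−p), so lim_{p→∞} f̃(x, p) = f(x) does hold here:

```python
from fractions import Fraction

def fl(x: Fraction, p: int) -> Fraction:
    """Truncate x to p fractional bits: |fl(x, p) - x| < 2**(-p)."""
    return Fraction((x.numerator << p) // x.denominator, 1 << p)

def f_exact(x: Fraction) -> Fraction:
    # the ideal STEP-A program: a straight-line computation of x^3 - 2x
    return x * x * x - 2 * x

def f_approx(x: Fraction, p: int) -> Fraction:
    # STEP B by transfer: every exact operation is replaced by its
    # p-bit counterpart (perform the operation, then truncate)
    t = fl(fl(x * x, p) * x, p)
    return fl(t - fl(2 * x, p), p)
```

For |x| ≤ 2, a routine error count gives |f̃(x, p) − f(x)| < 5 · 2^(−p), so the error shrinks to zero as p grows. The convergence question becomes delicate precisely when the program branches on value comparisons, as in rows (v) of Table 1.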
When a real pointer machine uses primitive operations g, h ∈ Ω, we face the problem of approximating g(h(x)) in the numerical pointer machine that simulates it. Thus, it is no longer sufficient to know only how to approximate g at base reals, since h(x) need not be a base real even if x ∈ F. Indeed, function composition becomes our central focus. In the analytic and algebraic approaches, the composition of computable functions is computable. But closure under composition is no longer automatic for approximable functions. This fact might be initially unsettling, but we believe it confirms the centrality of the EvalΩ problem, which is precisely about closure under composition over Ω.

7 Conclusion: Essential Duality

Our main objective was to construct a suitable foundation for the EGC approach to real computation. Eventually, we modified the analytic approach, and incorporated the algebraic approach into a larger synthesis. In this conclusion, we remark on a recurring theme involving the duality between the algebraic and analytic world views, and between abstract and concrete sets. The first idea in our approach is that we must use explicit computations. This follows Weihrauch's [39] insistence that machines can only manipulate names, which must be interpreted. Our intrinsic approach to explicit sets formally justifies the direct discussion of abstract mathematical objects, without the encumbrance of representations. This has the same beneficial effect as our underbar convention (Section 2). Now, interpreting names is just the flip side of the coin that says mathematical objects must be represented. In Tarski's theory of truth, we have an analogous situation of syntax and semantics. These live in complementary worlds which must not be conflated if they are each to play their roles successfully.
Thus, semantics in "explicit real computation" comes from the world of analysis, where we can freely define and prove properties of R without asking for their effectivity. Syntax comes from the world of representation elements and their manipulation under strong constraints. Interpretation, which connects these two worlds, comes from notations. A similar duality is reflected in our algebraic-numeric framework of Section 6: STEP A occurs in the ideal world of algebraic computation, while STEP B takes place in the constructive world of numerical computation. A natural connection between them is the transfer process from ideal programs to implementable programs. The BCSS manifesto [5] argues cogently for having the ideal world. We fully agree, only adding that we must not forget the constructive complement to this ideal world. The second idea concerns how to build the constructive world for real computation: it is that we must not take the "obvious first step" of incorporating all real numbers. Any computational model that incorporates this uncountable set R must suffer major negative consequences: it may lead to non-realizability (as in the algebraic approach) or a weak theory (as in the analytic approach, which handles only continuous functions). The restriction to continuous functions is unacceptable in our applications to computational geometry, where all the interesting geometric phenomena occur at discontinuities. In any case, the corresponding complexity theory is necessarily distorted. We must not even try to embrace so large a set as the computable reals: the