1. An open-source analytic data warehouse for ad-hoc queries
(1) Column-oriented storage, automatically tuned
a1. High compression ratios, particularly in the context of analytic, decision-support querying.
a2. Column storage suits analytic data warehousing, where queries selectively access small subsets of columns and data compression is emphasized.
a3. Row storage is the better choice for OLTP systems.
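The contrast in a1 to a3 can be illustrated with a minimal sketch. The table, its column names, and the use of run-length encoding are assumptions made for illustration only; the notes do not say which compression scheme the warehouse actually uses.

```python
from itertools import groupby

# Hypothetical toy table stored row by row: (id, country, amount).
rows = [
    (1, "DE", 10),
    (2, "DE", 20),
    (3, "DE", 30),
    (4, "FR", 40),
    (5, "FR", 50),
]

# Column-oriented layout: one list per column.
columns = {
    "id": [r[0] for r in rows],
    "country": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# An analytic query such as SUM(amount) only needs one column.
total_amount = sum(columns["amount"])

# Values inside a single column tend to repeat, so they compress well.
encoded_country = run_length_encode(columns["country"])
```

In the row layout the same SUM would have to walk every full tuple, while an OLTP-style lookup of one complete record is a single access there, which is the trade-off a3 refers to.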
(2) Knowledge Grid: metadata with ultra-small overhead that serves as an alternative to classical indexes
b1. It supports query optimization and execution by minimizing the need for data reads and data decompression.
b2. It rests on the idea of automatically creating and using higher-level data about data.
b3. The elements of the Knowledge Grid are Knowledge Nodes. They describe relatively large portions (or combinations of portions) of data, called Data Packs (DPs), rather than single data items or rows.
b4. Knowledge Nodes are far smaller than standard indexes, so they can be processed faster and in memory, and more types of them can be stored. This fits the growing need to handle ad-hoc, unpredictable ways of querying, and it helps eliminate physical model retuning.
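How such metadata can reduce reads and decompression (b1) can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: it assumes each Knowledge Node holds only the minimum and maximum of its Data Pack, uses a tiny pack size, and the names `KnowledgeNode`, `classify`, and the three labels are hypothetical.

```python
from dataclasses import dataclass

PACK_SIZE = 4  # illustrative only; real packs would be far larger


@dataclass
class KnowledgeNode:
    lo: int  # assumed content: minimum value in the pack
    hi: int  # assumed content: maximum value in the pack


def build_packs(values, pack_size=PACK_SIZE):
    """Split a column into consecutive Data Packs."""
    return [values[i:i + pack_size] for i in range(0, len(values), pack_size)]


def build_nodes(packs):
    """Create one small metadata record per pack."""
    return [KnowledgeNode(min(p), max(p)) for p in packs]


def classify(node, threshold):
    """Decide from metadata alone how a pack relates to `value > threshold`."""
    if node.hi <= threshold:
        return "irrelevant"  # no row can match, skip the pack
    if node.lo > threshold:
        return "relevant"    # every row matches, no need to inspect values
    return "suspect"         # must read (decompress) the pack


def count_greater(values, threshold):
    """Count rows with value > threshold, reading as few packs as possible."""
    packs = build_packs(values)
    nodes = build_nodes(packs)
    count, packs_read = 0, 0
    for pack, node in zip(packs, nodes):
        kind = classify(node, threshold)
        if kind == "irrelevant":
            continue
        if kind == "relevant":
            count += len(pack)
            continue
        packs_read += 1
        count += sum(1 for v in pack if v > threshold)
    return count, packs_read


column_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
result = count_greater(column_a, 6)  # (matching rows, packs actually read)
```

With this data only the middle pack has to be opened; the first is skipped and the last is counted from its metadata, so no per-row index is required.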
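For example: SELECT MAX(X.D) FROM T JOIN X ON T.B = X.C WHERE T.A > 6. Pack-To-Pack produces a relation matrix M between the Data Packs (DPs) of T.B and the DPs of X.C. If the first DP of T.B and the first DP of X.C have at least one element in common, then M[1,1]=1; otherwise M[1,1]=0. This effectively reduces the number of DPs that the join has to process.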
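The matrix described above can be sketched in a few lines. The column values and the pack size are made up for illustration, and the helper names are hypothetical. Note that the code indexes from 0, so the entry the text calls M[1,1] is `M[0][0]` here.

```python
def split_into_packs(values, pack_size):
    """Split a column into consecutive Data Packs."""
    return [values[i:i + pack_size] for i in range(0, len(values), pack_size)]


def pack_to_pack(left_packs, right_packs):
    """M[i][j] = 1 if left pack i and right pack j share at least one value."""
    right_sets = [set(p) for p in right_packs]
    return [
        [1 if set(left) & right else 0 for right in right_sets]
        for left in left_packs
    ]


# Hypothetical join columns T.B and X.C, two packs of four values each.
t_b = [1, 2, 3, 4, 50, 60, 70, 80]
x_c = [3, 4, 5, 6, 90, 91, 92, 93]

t_b_packs = split_into_packs(t_b, 4)
x_c_packs = split_into_packs(x_c, 4)
M = pack_to_pack(t_b_packs, x_c_packs)

# Only pack pairs marked 1 can contribute rows to T.B = X.C.
candidate_pairs = [
    (i, j)
    for i, row in enumerate(M)
    for j, bit in enumerate(row)
    if bit
]
```

Here one of the four pack pairs survives, so the join only needs to compare the first pack of T.B with the first pack of X.C.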